Nikolay Malitsky
Approaching the Fifth Paradigm
#SAISEco4
Outline
 Paradigm Shift
 Spark-MPI Approach
 MPI-Based Deep Learning Applications
 Next: Reinforcement Learning Applications
Four Science Paradigms*
1. Experimental: describe empirical facts and test hypotheses
since: a thousand years ago
2. Theoretical: explain and predict natural phenomena using
models and abstractions
since: several hundred years ago
3. Computational: simulate theoretical models using computers
since: second half of the 20th century
4. Data-Intensive: scientific discoveries based on Big Data analytics
since: around 15 years ago
*Jim Gray and Alex Szalay, eScience – A Transformed Scientific Method, NRC-CSTB, 2007
Paradigm Shift
▪ The fourth paradigm of data-intensive science rapidly became a major conceptual
approach for multiple application domains encompassing and generating large-
scale scientific drivers such as fusion reactors and light source facilities.
▪ The success of data-intensive projects subsequently triggered an explosion
of numerous machine learning approaches addressing a wide range of
industrial and scientific applications such as computer vision, self-driving
cars, and brain modelling.
▪ The next generation of artificial intelligence systems represents
a paradigm shift from data processing pipelines towards cognitive
knowledge-centric applications.
Figure 1 labels: 3rd Paradigm: Computational Science; 4th Paradigm: Data-Intensive Science; 5th Paradigm: Cognitive Computing (DeepMind AlphaGo, IBM Watson DeepQA, Human Brain Project)
▪ As shown in Fig. 1, AI systems broke the boundaries of computational and
data-intensive paradigms and began to form a new ecosystem by merging and
extending existing technologies.
Figure 1: The Fifth Paradigm*
*N. Malitsky, R. Castain, and M. Cowan, Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications, arXiv:1806.01110, 2018
Knowledge
▪ In his original talk, Jim Gray discussed “objectifying” knowledge within the field of ontology for
providing a structured representation of abstract concepts and physical entities. This direction is
related to the development of structured knowledge bases and associated technologies such as the
Semantic Web and Linked Data.
▪ Existing structured resources, however, capture only a tiny subset of available information. Therefore,
advanced question-answering (QA) systems* augmented them with corpora of raw text and processing
pipelines consisting of multiple stages that combine hundreds of different cooperating algorithms from
various fields.
As a result, emerging AI-oriented applications imply a more general and practical definition of knowledge:
Knowledge is a multifaceted substance distributed among heterogeneous information networks and
associated processing platforms. The structure of, and relationships between, the components of such
a composite representation are dynamic, continuously shaped and consolidated by machine learning
processes.
*D. A. Ferrucci, Introduction to “This is Watson”, IBM Journal of Research and Development, 2012
From Processing Pipelines to Rational Agents
Diagram panels: Data-intensive processing pipelines; Deep learning model-centric applications; Reinforcement learning agent-oriented applications (node labels: W, D, O, M)
Approaching the Fifth Paradigm of Cognitive Applications
*Dharshan Kumaran, Demis Hassabis, and James L. McClelland, What Learning Systems do Intelligent Agents Need?
Complementary Learning Systems, Trends in Cognitive Sciences, 2016
Figure 2 labels: Neocortex / Heterogeneous Knowledge and Information Network; Hippocampus / Streaming Pipeline
Figure 2: Complementary Learning Systems*
The consolidation of HPC and Big Data
machine learning technologies
is a prerequisite for
developing the next paradigm of
cognitive applications.
Figure 1 (The Fifth Paradigm) labels: 3rd Paradigm: Computational Science; 4th Paradigm: Data-Intensive Science; 5th Paradigm: Cognitive Applications
Spark-MPI Approach
Closing the gap between Big Data and HPC computing
*Geoffrey Fox et al. HPC-ABDC High Performance Computing Enhanced Apache Big Data Stack, CCGrid, 2015
Ecosystems*: Big Data (Spark) and HPC Computing (MPI)
New Frontiers
MPI: Message Passing Interface
Application Programming Interface:
▪ peer-to-peer: allreduce
▪ master-workers: scatter, gather, reduce
▪ point-to-point: send, receive
▪ remote memory access: put, get
Portable Access Layer for various communication protocols:
▪ RDMA
▪ GPUDirect RDMA
▪ TCP/IP
▪ shared memory
▪ …
Process Management Interface:
▪ address exchange service
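The Application Programming Interface patterns listed on this slide can be illustrated without an MPI installation. The following is a minimal sketch in plain Python that models each rank's buffer as one list entry. The function names mirror the MPI operations but are hypothetical stand-ins and do not call any MPI library.

```python
from functools import reduce as fold
from operator import add

# Hypothetical model: index i of each list is the buffer of rank i.

def scatter(chunks):
    """Master-workers: the root holds one chunk per rank; rank i receives chunks[i]."""
    return list(chunks)

def gather(per_rank):
    """Master-workers: the root collects one value from every rank, in rank order."""
    return list(per_rank)

def reduce_to_root(per_rank, op=add):
    """Master-workers: combine all rank values with op; only the root gets the result."""
    return fold(op, per_rank)

def allreduce(per_rank, op=add):
    """Peer-to-peer: combine all rank values with op; every rank gets the result."""
    total = fold(op, per_rank)
    return [total for _ in per_rank]

if __name__ == "__main__":
    data = [[1, 2], [3, 4], [5, 6]]              # root's data, split for 3 ranks
    local = scatter(data)                         # rank i now owns local[i]
    partial = [sum(chunk) for chunk in local]     # each rank computes locally
    print(gather(partial))                        # root sees [3, 7, 11]
    print(reduce_to_root(partial))                # root sees 21
    print(allreduce(partial))                     # every rank holds 21
```

The difference between reduce and allreduce is only where the result ends up, which is why allreduce suits the peer-to-peer setting where no rank plays the master role.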
PMI-based Spark-MPI Approach
Diagram labels: Spark Driver; PMI Server; Spark Worker (×4)
Interfaces: Spark driver-worker, PMI server-worker, MPI inter-worker
▪ Spark-MPI (1) encompasses three interfaces. Specifically, it
complements the conventional Spark driver-worker model
with the PMI server-worker interface for establishing MPI
inter-worker communications.
▪ Process Management Interface (PMI):
originally developed by the MPICH team (2)
and used for exchanging wireup information
among processes.
▪ PMI-Exascale (PMIx): created by the Open
MPI team (3) in response to the ever-increasing
scale of supercomputing clusters.
▪ The PMIx community has therefore focused on extending
the earlier PMI work, adding flexibility to existing APIs (e.g.,
to support asynchronous operations) as well as new APIs
that broaden the range of interactions with the resident
resource manager.
(1) N. Malitsky et al. Building Near-Real-Time Processing Pipelines with the Spark-MPI platform, NYSDS, 2017
(2) P. Balaji et al. PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems, 2010
(3) R. Castain, D. Solt, J. Hursey, and A. Bouteiller, PMIx: Process Management for Exascale Environment, 2017
Open MPI*
Open MPI was derived as a generalization of four projects bringing together over 40 frameworks. It introduced
a Modular Component Architecture (MCA) that utilized components (a.k.a. plugins) to provide alternative
implementations of key functional blocks such as message transport, mapping, algorithms, and collective
operations.
*E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain,
D. J. Daniel, R. L. Graham, and T. S. Woodall, Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, 2004
Architecture diagram: MPI application on top of the Modular Component Architecture (MCA); each framework consists of a base and a set of components.
Implementation diagram: MPI application on top of the Open MPI core (OPAL, ORTE, and OMPI layers); frameworks shown:
▪ OpenRTE Daemon’s Launch Subsystem (odls): base, default, sparkmpi, …
▪ MPI byte transfer layer (btl): base, tcp, ofi, smcuda, …
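The component idea can be sketched as a registry in which each framework maps component names to interchangeable implementations. This is a toy model in Python, not Open MPI's actual C interfaces: the Framework class, its methods, and the describe helper are hypothetical, and only the framework and component names (btl with tcp and smcuda, odls with default and sparkmpi) are taken from the slide.

```python
from typing import Callable, Dict

class Framework:
    """A functional block (e.g. byte transfer) with swappable components."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._components: Dict[str, Callable[[str], str]] = {}

    def register(self, component: str, impl: Callable[[str], str]) -> None:
        self._components[component] = impl

    def select(self, component: str) -> Callable[[str], str]:
        # Selection happens at run time, so swapping needs no application changes.
        if component not in self._components:
            raise KeyError(f"{self.name}: no component named {component!r}")
        return self._components[component]

def describe(component: str) -> Callable[[str], str]:
    """Placeholder behavior: report which component handled the request."""
    return lambda arg: f"{component} handles {arg}"

btl = Framework("btl")
for name in ("tcp", "smcuda"):
    btl.register(name, describe(name))

odls = Framework("odls")
for name in ("default", "sparkmpi"):
    odls.register(name, describe(name))

if __name__ == "__main__":
    print(btl.select("tcp")("payload"))
    print(odls.select("sparkmpi")("launch request"))
```

The application code calls select() with a name, so an alternative implementation such as sparkmpi can be added to a framework without touching the other components.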
Spark-MPI Integrated Platform
N. Malitsky, R. Castain, and M. Cowan, Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications, arXiv:1806.01110, 2018
Diagram labels: MPI-Based Algorithms; Process Management Interface (PMI); Connectors; Resilient Distributed Dataset API; Receivers; Spark Platform; HPC Extensions; SLURM; Parallel File Systems; Streaming Sources
MPI-Based Deep Learning Applications
Deep Learning Training as a Third Paradigm
Computational Application
Diagram: Parameter Server-based Data Parallel Model* (P: Parameter Server, W: DL Worker) and All-Reduce Model (W: DL Worker)
*Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2015
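As a rough illustration of the two data-parallel models on this slide, the sketch below computes one averaged-gradient update both ways in plain Python. It assumes synchronous updates and plain SGD. The function names and the learning rate are hypothetical, and no TensorFlow API is used.

```python
def average(grads):
    """Element-wise mean of equally sized gradient vectors."""
    n = len(grads)
    return [sum(col) / n for col in zip(*grads)]

def parameter_server_step(params, worker_grads, lr=0.1):
    """Workers push gradients to P; P averages and updates; workers pull a copy."""
    mean = average(worker_grads)                       # computed on the server
    new_params = [p - lr * g for p, g in zip(params, mean)]
    return [list(new_params) for _ in worker_grads]    # one pulled copy per worker

def allreduce_step(worker_params, worker_grads, lr=0.1):
    """No server: every worker obtains the mean gradient and updates locally."""
    mean = average(worker_grads)                       # what the all-reduce delivers
    return [[p - lr * g for p, g in zip(params, mean)] for params in worker_params]

if __name__ == "__main__":
    params = [1.0, 2.0]
    grads = [[0.5, 1.0], [1.5, 3.0]]                   # one gradient per worker
    via_server = parameter_server_step(params, grads)
    via_allreduce = allreduce_step([params, params], grads)
    print(via_server == via_allreduce)                 # both models give the same update
```

The two models differ in where the averaging happens and how the traffic flows, not in the resulting parameters.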
(Some of the) MPI DL Projects
▪ CNTK (1): Microsoft Cognitive Toolkit
▪ TensorFlow-Matex (2): added two new TensorFlow operators, Global_Broadcast and
MPI_Allreduce
▪ S-Caffe (3): scaled Caffe with the MPI-level hierarchical reduction design
▪ Horovod (4): adopted Baidu’s approach based on the ring-allreduce algorithm and further
developed its implementation with NVIDIA’s NCCL library for collective communication
▪ CPE ML Plugin (5): Cray Programming Environment Machine Learning Plugin
(1) A. Agarwal et al. An Introduction to Computational Networks and Computational Network Toolkit, 2014
(2) A. Vishnu et al. User-transparent distributed TensorFlow, 2017
(3) A. A. Awan et al. S-Caffe: co-designing MPI runtime and Caffe for scalable deep learning on modern GPU clusters, 2017
(4) A. Sergeev and M. Del Balso. Horovod: fast and easy distributed deep learning in TensorFlow, 2018
(5) P. Mendygral. Scaling Deep Learning, 2018
Spark-MPI-Horovod
Script annotations:
▪ Initialize Horovod and MPI
▪ Extract the MNIST dataset
▪ Build the DL model
▪ …
▪ Create the TF optimizer
▪ Wrap TF with Horovod
▪ Run the Horovod MPI-based training
▪ Initialize the PMI environment variables
▪ Run the Horovod training on Spark workers
The Horovod MPI-based training framework replaces
the TensorFlow parameter servers with the ring-
allreduce approach for averaging gradients among
TensorFlow workers.
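The ring-allreduce idea can be simulated in a single Python process. The sketch below is not Horovod's implementation, which relies on MPI or NCCL for the transfers. It only shows the two phases of the algorithm, and it assumes that the vector length is divisible by the number of workers. The function names are hypothetical.

```python
def ring_allreduce(worker_vectors):
    """Simulate ring-allreduce (sum) over equally sized vectors, one per worker."""
    n = len(worker_vectors)
    size = len(worker_vectors[0])
    if size % n != 0:
        raise ValueError("this sketch needs a vector length divisible by the worker count")
    step = size // n
    # bufs[i][c] is worker i's copy of chunk c.
    bufs = [[list(v[c * step:(c + 1) * step]) for c in range(n)] for v in worker_vectors]

    # Phase 1, scatter-reduce: after n-1 steps worker i holds the full sum of chunk (i+1) % n.
    for s in range(n - 1):
        sent = [(i, (i - s) % n, list(bufs[i][(i - s) % n])) for i in range(n)]
        for i, c, chunk in sent:                       # all sends happen simultaneously
            dst = (i + 1) % n
            bufs[dst][c] = [a + b for a, b in zip(bufs[dst][c], chunk)]

    # Phase 2, allgather: completed chunks travel around the ring, replacing partial sums.
    for s in range(n - 1):
        sent = [(i, (i + 1 - s) % n, list(bufs[i][(i + 1 - s) % n])) for i in range(n)]
        for i, c, chunk in sent:
            bufs[(i + 1) % n][c] = chunk

    return [[x for chunk in worker for x in chunk] for worker in bufs]

def averaged(worker_grads):
    """Gradient averaging: sum with ring-allreduce, then divide by the worker count."""
    n = len(worker_grads)
    return [[x / n for x in vec] for vec in ring_allreduce(worker_grads)]

if __name__ == "__main__":
    grads = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    print(ring_allreduce(grads)[0])   # every worker ends with the same summed vector
    print(averaged(grads)[0])
```

Each worker only ever talks to its ring neighbor and sends one chunk per step, so no single node has to receive every worker's gradients as a parameter server would.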
For users, the corresponding integration consists of two
primary steps as illustrated by the script: (1) initializing
Horovod with hvd.init() and (2) wrapping TensorFlow
worker’s optimizer with hvd.DistributedOptimizer().
Spark-MPI pipelines make it possible to run the Horovod
training on Spark workers with Map operations. To
establish MPI communication among the Spark workers,
the Map operation (e.g. train()) only needs to define
PMI-related environment variables (such as
PMIX_RANK and a port number).
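A minimal sketch of that last step, assuming that each worker learns its rank from its partition index. Only PMIX_RANK comes from the slide. EXAMPLE_PMI_PORT is a placeholder name for the port setting, the port value 5000 is arbitrary, and the plain map() call stands in for a Spark Map operation so that the sketch runs without a cluster.

```python
import os

def pmi_environment(rank, port):
    """Build the per-worker settings; EXAMPLE_PMI_PORT is a placeholder name."""
    return {"PMIX_RANK": str(rank), "EXAMPLE_PMI_PORT": str(port)}

def train(rank, port=5000, environ=None):
    """Stand-in for the Map function applied on each Spark worker."""
    env = os.environ if environ is None else environ
    env.update(pmi_environment(rank, port))   # must be set before MPI is initialized
    # A real function would now initialize Horovod and MPI and run the training.
    return int(env["PMIX_RANK"])

if __name__ == "__main__":
    # Plain map() over ranks stands in for a Spark Map over worker partitions.
    ranks = list(map(lambda r: train(r, environ={}), range(4)))
    print(ranks)
```

Passing a dictionary as environ keeps the example from modifying the real process environment. On an actual worker the default os.environ would be used.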
Next
Deep Reinforcement Learning
**R. Nishihara et al. Real-Time Machine Learning: The Missing Pieces, arXiv:1703.03924, 2017
*A. Nair et al. Massively Parallel Methods for Deep Reinforcement Learning, ICML, 2015
Diagram labels: Environment; Actor; Agent; Replay Memory; DQN Learner; Parameter Server; transitions (s, a, r, s’); action selection argmax_a Q(s, a; θ); feedback (r, s’)
Figure 1: Gorila* (General Reinforcement Learning Architecture)
System Requirements**:
• Low latency
• High throughput
• Dynamic task creation
• Heterogeneous tasks
• Arbitrary dataflow dependencies
• Transparent fault tolerance
• Debuggability and profiling
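The roles in the Gorila diagram can be sketched in a few lines of Python. This is a single-process, tabular stand-in: Gorila itself distributes actors, learners, and the parameter server across machines and uses a neural network for Q. The class and function names, the learning rate, and the discount factor are hypothetical.

```python
import random
from collections import deque

class ReplayMemory:
    """Fixed-capacity store of (s, a, r, s') transitions."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))

def greedy_action(q_values):
    """argmax_a Q(s, a; theta) for one state, given its row of Q-values."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_learning_update(q, transition, lr=0.5, gamma=0.9):
    """Tabular stand-in for the learner: move Q(s, a) toward r + gamma * max Q(s', .)."""
    s, a, r, s_next = transition
    target = r + gamma * max(q[s_next])
    q[s][a] += lr * (target - q[s][a])

if __name__ == "__main__":
    q = [[0.0, 0.0], [0.0, 0.0]]          # 2 states x 2 actions
    memory = ReplayMemory(capacity=100)
    memory.push(0, 1, 1.0, 1)             # actor observed reward 1 for action 1
    for t in memory.sample(1):
        q_learning_update(q, t)
    print(greedy_action(q[0]))            # action 1 now has the larger Q-value
```

Separating the memory from the learner is what lets actors and learners run at different rates, which relates to the throughput and heterogeneous-task requirements listed above.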
(Some of the) RL Applications*
▪ Atari Games (1)
▪ AlphaGo (2)
▪ Robotics
▪ Self-driving vehicles
▪ Autonomous UAVs
…
“Pterodactylus antiquus, the first pterosaur
species to be named and identified as a flying
reptile … 150.8–148.5 million years ago”
(Wikipedia)
(1) V. Mnih et al. Playing Atari with Deep Reinforcement Learning, NIPS, 2013
(2) D. Silver et al. Mastering the game of Go with deep neural networks and tree search, Nature, 2016
Summary
▪ Emerging AI projects represent a paradigm shift from data processing pipelines towards
the fifth paradigm of cognitive knowledge-centric applications.
▪ The new generation of AI composite applications requires the integration of Big Data and
HPC technologies. For example, MPI was originally introduced within the computational
paradigm ecosystem for developing HPC scientific applications, but it has recently been
applied successfully to extend the scale of deep learning applications.
▪ Knowledge is a multifaceted substance distributed among heterogeneous information
networks and associated processing platforms. The structure of, and relationships between,
its components are dynamic, continuously shaped and consolidated by machine
learning processes.
▪ Spark-MPI addresses this strategic direction by extending the Spark platform with MPI-
based HPC applications using the Process Management Interface (PMI).
#SAISEco4

More Related Content

What's hot

A location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsA location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsIAEME Publication
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
vty
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
vty
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
vty
 
Collective Mind: a collaborative curation tool for program optimization
Collective Mind: a collaborative curation tool for program optimizationCollective Mind: a collaborative curation tool for program optimization
Collective Mind: a collaborative curation tool for program optimization
Grigori Fursin
 
Primers or Reminders? The Effects of Existing Review Comments on Code Review
Primers or Reminders? The Effects of Existing Review Comments on Code ReviewPrimers or Reminders? The Effects of Existing Review Comments on Code Review
Primers or Reminders? The Effects of Existing Review Comments on Code Review
Delft University of Technology
 
Data for Science Service Portfolio
Data for Science Service PortfolioData for Science Service Portfolio
Data for Science Service Portfolio
EUDAT
 
Thesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signedThesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signedCheng Guo
 
Benchmarking open source deep learning frameworks
Benchmarking open source deep learning frameworksBenchmarking open source deep learning frameworks
Benchmarking open source deep learning frameworks
IJECEIAES
 

What's hot (10)

A location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applicationsA location based least-cost scheduling for data-intensive applications
A location based least-cost scheduling for data-intensive applications
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
 
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in DataverseClariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
Clariah Tech Day: Controlled Vocabularies and Ontologies in Dataverse
 
External controlled vocabularies support in Dataverse
External controlled vocabularies support in DataverseExternal controlled vocabularies support in Dataverse
External controlled vocabularies support in Dataverse
 
Collective Mind: a collaborative curation tool for program optimization
Collective Mind: a collaborative curation tool for program optimizationCollective Mind: a collaborative curation tool for program optimization
Collective Mind: a collaborative curation tool for program optimization
 
Primers or Reminders? The Effects of Existing Review Comments on Code Review
Primers or Reminders? The Effects of Existing Review Comments on Code ReviewPrimers or Reminders? The Effects of Existing Review Comments on Code Review
Primers or Reminders? The Effects of Existing Review Comments on Code Review
 
NOGESI case study
NOGESI case studyNOGESI case study
NOGESI case study
 
Data for Science Service Portfolio
Data for Science Service PortfolioData for Science Service Portfolio
Data for Science Service Portfolio
 
Thesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signedThesies_Cheng_Guo_2015_fina_signed
Thesies_Cheng_Guo_2015_fina_signed
 
Benchmarking open source deep learning frameworks
Benchmarking open source deep learning frameworksBenchmarking open source deep learning frameworks
Benchmarking open source deep learning frameworks
 

Similar to Spark-MPI: Approaching the Fifth Paradigm with Nikolay Malitsky

OpenACC Monthly Highlights: January 2024
OpenACC Monthly Highlights: January 2024OpenACC Monthly Highlights: January 2024
OpenACC Monthly Highlights: January 2024
OpenACC
 
Enabling Application Integrated Proactive Fault Tolerance
Enabling Application Integrated Proactive Fault ToleranceEnabling Application Integrated Proactive Fault Tolerance
Enabling Application Integrated Proactive Fault Tolerance
Dai Yang
 
2023comp90024_Spartan.pdf
2023comp90024_Spartan.pdf2023comp90024_Spartan.pdf
2023comp90024_Spartan.pdf
LevLafayette1
 
Programming Modes and Performance of Raspberry-Pi Clusters
Programming Modes and Performance of Raspberry-Pi ClustersProgramming Modes and Performance of Raspberry-Pi Clusters
Programming Modes and Performance of Raspberry-Pi Clusters
AM Publications
 
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
OpenACC and Open Hackathons Monthly Highlights May  2023.pdfOpenACC and Open Hackathons Monthly Highlights May  2023.pdf
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
OpenACC
 
An optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computingAn optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computing
DIGVIJAY SHINDE
 
Seminar VU Amsterdam 2015
Seminar VU Amsterdam 2015Seminar VU Amsterdam 2015
Seminar VU Amsterdam 2015
Philipp Leitner
 
Netsoft19 Keynote: Fluid Network Planes
Netsoft19 Keynote: Fluid Network PlanesNetsoft19 Keynote: Fluid Network Planes
Netsoft19 Keynote: Fluid Network Planes
Christian Esteve Rothenberg
 
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
Spark Summit
 
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENTAN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
csandit
 
CSE NEW_4th yr w.e.f. 2018-19.pdf
CSE NEW_4th yr w.e.f. 2018-19.pdfCSE NEW_4th yr w.e.f. 2018-19.pdf
CSE NEW_4th yr w.e.f. 2018-19.pdf
ssuser5a7261
 
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC and Hackathons Monthly Highlights: April  2023OpenACC and Hackathons Monthly Highlights: April  2023
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC
 
Closer17.ppt
Closer17.pptCloser17.ppt
Closer17.ppt
Ptidej Team
 
Closer17.ppt
Closer17.pptCloser17.ppt
Duc le CV
Duc le CVDuc le CV
Duc le CV
Duc Minh Le
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
Abhi Jit
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginners
hpcexperiment
 
PlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web ScalePlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web Scale
PlanetData Network of Excellence
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics Interoperability
Open Cyber University of Korea
 

Similar to Spark-MPI: Approaching the Fifth Paradigm with Nikolay Malitsky (20)

OpenACC Monthly Highlights: January 2024
OpenACC Monthly Highlights: January 2024OpenACC Monthly Highlights: January 2024
OpenACC Monthly Highlights: January 2024
 
Enabling Application Integrated Proactive Fault Tolerance
Enabling Application Integrated Proactive Fault ToleranceEnabling Application Integrated Proactive Fault Tolerance
Enabling Application Integrated Proactive Fault Tolerance
 
2023comp90024_Spartan.pdf
2023comp90024_Spartan.pdf2023comp90024_Spartan.pdf
2023comp90024_Spartan.pdf
 
Programming Modes and Performance of Raspberry-Pi Clusters
Programming Modes and Performance of Raspberry-Pi ClustersProgramming Modes and Performance of Raspberry-Pi Clusters
Programming Modes and Performance of Raspberry-Pi Clusters
 
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
OpenACC and Open Hackathons Monthly Highlights May  2023.pdfOpenACC and Open Hackathons Monthly Highlights May  2023.pdf
OpenACC and Open Hackathons Monthly Highlights May 2023.pdf
 
An optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computingAn optimized scientific workflow scheduling in cloud computing
An optimized scientific workflow scheduling in cloud computing
 
Seminar VU Amsterdam 2015
Seminar VU Amsterdam 2015Seminar VU Amsterdam 2015
Seminar VU Amsterdam 2015
 
Netsoft19 Keynote: Fluid Network Planes
Netsoft19 Keynote: Fluid Network PlanesNetsoft19 Keynote: Fluid Network Planes
Netsoft19 Keynote: Fluid Network Planes
 
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
Bringing HPC Algorithms to Big Data Platforms: Spark Summit East talk by Niko...
 
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENTAN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
AN EMPIRICAL STUDY OF USING CLOUD-BASED SERVICES IN CAPSTONE PROJECT DEVELOPMENT
 
CSE NEW_4th yr w.e.f. 2018-19.pdf
CSE NEW_4th yr w.e.f. 2018-19.pdfCSE NEW_4th yr w.e.f. 2018-19.pdf
CSE NEW_4th yr w.e.f. 2018-19.pdf
 
OpenACC and Hackathons Monthly Highlights: April 2023
OpenACC and Hackathons Monthly Highlights: April  2023OpenACC and Hackathons Monthly Highlights: April  2023
OpenACC and Hackathons Monthly Highlights: April 2023
 
Closer17.ppt
Closer17.pptCloser17.ppt
Closer17.ppt
 
Closer17.ppt
Closer17.pptCloser17.ppt
Closer17.ppt
 
Duc le CV
Duc le CVDuc le CV
Duc le CV
 
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORKMACHINE LEARNING ON MAPREDUCE FRAMEWORK
MACHINE LEARNING ON MAPREDUCE FRAMEWORK
 
UberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for BeginnersUberCloud HPC Experiment Introduction for Beginners
UberCloud HPC Experiment Introduction for Beginners
 
PlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web ScalePlanetData: Consuming Structured Data at Web Scale
PlanetData: Consuming Structured Data at Web Scale
 
Planetdata simpda
Planetdata simpdaPlanetdata simpda
Planetdata simpda
 
Proof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics InteroperabilityProof of Concept for Learning Analytics Interoperability
Proof of Concept for Learning Analytics Interoperability
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
Databricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
Databricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
Databricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
Databricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
Databricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Databricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
Databricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Databricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
Databricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Databricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
Databricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
Databricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
Databricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
Databricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Spark-MPI: Approaching the Fifth Paradigm with Nikolay Malitsky

  • 1. Nikolay Malitsky Approaching the Fifth Paradigm #SAISEco4
  • 2. Outline
      ▪ Paradigm Shift
      ▪ Spark-MPI Approach
      ▪ MPI-Based Deep Learning Applications
      ▪ Next: Reinforcement Learning Applications
  • 3. Four Science Paradigms*
      1. Experimental: describe empirical facts and test hypotheses (since: a thousand years ago)
      2. Theoretical: explain and predict natural phenomena using models and abstractions (since: several hundred years ago)
      3. Computational: simulate theoretical models using computers (since: second half of the 20th century)
      4. Data-Intensive: scientific discoveries based on Big Data analytics (since: around 15 years ago)
      *Jim Gray and Alex Szalay, eScience – A Transformed Scientific Method, NRC-CSTB, 2007
  • 4. Paradigm Shift
      ▪ The fourth paradigm of data-intensive science rapidly became a major conceptual approach for multiple application domains encompassing and generating large-scale scientific drivers such as fusion reactors and light source facilities.
      ▪ The success of data-intensive projects subsequently triggered an explosion of machine learning approaches addressing a wide range of industrial and scientific applications such as computer vision, self-driving cars, and brain modelling.
      ▪ The next generation of artificial intelligence systems clearly represents a paradigm shift from data processing pipelines towards cognitive knowledge-centric applications.
      ▪ As shown in Fig. 1, AI systems broke the boundaries of computational and data-intensive paradigms and began to form a new ecosystem by merging and extending existing technologies.
      Figure 1: The Fifth Paradigm* [labels: 3rd Paradigm: Computational Science; 4th Paradigm: Data-Intensive Science; 5th Paradigm: Cognitive Computing; DeepMind AlphaGo; IBM Watson DeepQA; Human Brain Project]
      *N. Malitsky, R. Castain, and M. Cowan, Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications, arXiv:1806.01110, 2018
  • 5. Knowledge
      ▪ In his original talk, Jim Gray discussed “objectifying” knowledge within the field of ontology for providing a structured representation of abstract concepts and physical entities. This direction is related to the development of structured knowledge bases and associated technologies such as the Semantic Web and Linked Data.
      ▪ Existing structured resources, however, capture only a tiny subset of available information. Therefore, advanced question-answering (QA) systems* augmented them with corpora of raw text and processing pipelines consisting of multiple stages that combine hundreds of different cooperating algorithms from various fields.
      As a result, emerging AI-oriented applications imply a more general and practical knowledge definition: Knowledge is a multifaceted substance distributed among heterogeneous information networks and associated processing platforms. The structure and relationship between different components of such a composite representation is dynamic, continuously shaped and consolidated by machine learning processes.
      *D. A. Ferrucci, Introduction to “This is Watson”, IBM Journal of Research and Development, 2012
  • 6. From Processing Pipelines to Rational Agents
      [Diagram of three application types, drawn with nodes labelled W, D, O, and M:]
      ▪ Data-intensive processing pipelines
      ▪ Deep learning model-centric applications
      ▪ Reinforcement learning agent-oriented applications
  • 7. Approaching the Fifth Paradigm of Cognitive Applications
      The consolidation of HPC and Big Data machine learning technologies is a prerequisite for developing the next paradigm of cognitive applications.
      Figure 1: The Fifth Paradigm [labels: 3rd Paradigm: Computational Science; 4th Paradigm: Data-Intensive Science; 5th Paradigm: Cognitive Applications]
      Figure 2: Complementary Learning Systems* [labels: Neocortex / Heterogeneous Knowledge and Information Network; Hippocampus / Streaming Pipeline]
      *Dharshan Kumaran, Demis Hassabis, and James L. McClelland, What Learning Systems do Intelligent Agents Need? Complementary Learning Systems, Trends in Cognitive Sciences, 2016
  • 9. Closing the gap between Big Data and HPC computing
      [Diagram labels: Spark; MPI; Ecosystems*: Big Data, HPC Computing; New Frontiers]
      *Geoffrey Fox et al., HPC-ABDS: High Performance Computing Enhanced Apache Big Data Stack, CCGrid, 2015
  • 10. MPI: Message Passing Interface
      Application Programming Interface:
      ▪ peer-to-peer: allreduce
      ▪ master-workers: scatter, gather, reduce
      ▪ point-to-point: send, receive
      ▪ remote memory access: put, get
      Portable Access Layer for various communication protocols:
      ▪ RDMA
      ▪ GPUDirect RDMA
      ▪ TCP/IP
      ▪ shared memory
      Process Management Interface:
      ▪ address exchange service
      ▪ …
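To make the collective operations named on this slide concrete, here is a minimal single-process sketch of their semantics. It is not MPI code: a Python list stands in for the set of ranks (index = rank), and the function names `scatter`, `gather`, `reduce_to_root`, and `allreduce` are illustrative helpers chosen for this example, not calls into any MPI library.

```python
from functools import reduce as fold
from operator import add


def scatter(chunks):
    """Root splits its data: rank i receives chunks[i]."""
    return list(chunks)


def gather(per_rank):
    """Root collects one value from every rank, ordered by rank."""
    return list(per_rank)


def reduce_to_root(per_rank, op=add):
    """Only the root receives the combination of all ranks' values."""
    return fold(op, per_rank)


def allreduce(per_rank, op=add):
    """Every rank receives the same combined value."""
    total = fold(op, per_rank)
    return [total] * len(per_rank)


# Distributed sum of squares over four simulated ranks.
data = [[1, 2], [3, 4], [5, 6], [7, 8]]
local = scatter(data)                                     # one chunk per rank
partial = [sum(x * x for x in chunk) for chunk in local]  # local work
root_total = reduce_to_root(partial)                      # master-workers style
everyone = allreduce(partial)                             # peer-to-peer style
```

The difference between the last two lines mirrors the slide's grouping: `reduce` leaves the result on a single master rank, while `allreduce` leaves an identical copy on every peer.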
  • 11. PMI-based Spark-MPI Approach
      ▪ Spark-MPI (1) encompasses three interfaces. Specifically, it complements the Spark conventional driver-worker model with the PMI server-worker interface for establishing MPI inter-worker communications.
      ▪ Process Management Interface (PMI): originally developed by the MPICH team (2) and used for exchanging wireup information among processes.
      ▪ PMI-Exascale (PMIx): created by the Open MPI team (3) in response to the ever-increasing scale of supercomputing clusters.
      ▪ The PMIx community has therefore focused on extending the earlier PMI work, adding flexibility to existing APIs (e.g., to support asynchronous operations) as well as new APIs that broaden the range of interactions with the resident resource manager.
      [Diagram: Spark Driver, PMI Server, four Spark Workers. Interfaces: Spark driver-worker, PMI server-worker, MPI inter-worker]
      (1) N. Malitsky et al. Building Near-Real-Time Processing Pipelines with the Spark-MPI platform, NYSDS, 2017
      (2) P. Balaji et al. PMI: A Scalable Parallel Process-Management Interface for Extreme-Scale Systems, 2010
      (3) R. Castain, D. Solt, J. Hursey, and A. Bouteiller, PMIx: Process Management for Exascale Environment, 2017
  • 12. Open MPI*
      Open MPI was derived as a generalization of four projects bringing together over 40 frameworks. It introduced a Modular Component Architecture (MCA) that utilized components (a.k.a. plugins) to provide alternative implementations of key functional blocks such as message transport, mapping, algorithms, and collective operations.
      Architecture: MPI application; Modular Component Architecture (MCA); Frameworks, each with a Base and components
      Implementation: MPI application; Open MPI core (OPAL, ORTE, and OMPI layers); MPI byte transfer layer (btl): Base, tcp, ofi, smcuda, …; OpenRTE Daemon’s Launch Subsystem (odls): Base, default, sparkmpi, …
      *E. Gabriel, G. E. Fagg, G. Bosilca, T. Angskun, J. J. Dongarra, J. Squyres, V. Sahay, P. Kambadur, B. Barrett, A. Lumsdaine, R. H. Castain, D. J. Daniel, R. L. Graham, and T. S. Woodall, Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation, 2004
  • 13. Spark-MPI Integrated Platform
      [Diagram labels: Streaming Sources; Receivers; Connectors; Resilient Distributed Dataset API; Spark Platform; HPC Extensions; MPI-Based Algorithms; Process Management Interface (PMI); SLURM; Parallel File Systems]
      N. Malitsky, R. Castain, and M. Cowan, Spark-MPI: Approaching the Fifth Paradigm of Cognitive Applications, arXiv:1806.01110, 2018
  • 14. MPI-Based Deep Learning Applications
  • 15. Deep Learning Training as a Third Paradigm Computational Application
      [Diagram of two models; W: DL Worker, P: Parameter Server]
      ▪ Parameter Server-based Data Parallel Model*
      ▪ All-Reduce Model
      *Abadi et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, 2015
  • 16. (Some of the) MPI DL Projects
      ▪ CNTK (1): Microsoft Cognitive Toolkit
      ▪ TensorFlow-Matex (2): added two new TensorFlow operators, Global_Broadcast and MPI_Allreduce
      ▪ S-Caffe (3): scaled Caffe with the MPI-level hierarchical reduction design
      ▪ Horovod (4): adopted Baidu’s approach based on the ring-allreduce algorithm and further developed its implementation with NVIDIA’s NCCL library for collective operations
      ▪ CPE ML Plugin (5): Cray Programming Environment Machine Learning Plugin
      (1) A. Agarwal et al. An Introduction to Computational Networks and Computational Network Toolkit, 2014
      (2) A. Vishnu et al. User-transparent distributed TensorFlow, 2017
      (3) A. A. Awan et al. S-Caffe: co-designing MPI runtime and Caffe for scalable deep learning on modern GPU clusters, 2017
      (4) A. Sergeev and M. Del Balso. Horovod: fast and easy distributed deep learning in TensorFlow, 2018
      (5) P. Mendygral. Scaling Deep Learning, 2018
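The ring-allreduce algorithm mentioned for Horovod can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not Horovod's or NCCL's implementation: workers are plain Python lists, "sending" is a list copy, and the function name `ring_allreduce_mean` is hypothetical. Each worker's gradient is split into as many chunks as there are workers. A scatter-reduce phase passes partial sums around the ring, then an allgather phase circulates the finished chunks.

```python
def ring_allreduce_mean(grads):
    """Average equal-length gradient vectors using the ring-allreduce pattern.

    grads[r] is the gradient held by worker r. Returns one averaged copy
    per worker. Simulated in a single process for illustration only.
    """
    n = len(grads)
    length = len(grads[0])
    # Chunk c covers indices [bounds[c], bounds[c + 1]).
    bounds = [c * length // n for c in range(n + 1)]
    bufs = [list(g) for g in grads]

    # Phase 1, scatter-reduce: after n - 1 steps, worker r holds the
    # complete sum for chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot all outgoing messages so the sends are simultaneous.
        msgs = []
        for r in range(n):
            c = (r - step) % n
            msgs.append((c, bufs[r][bounds[c]:bounds[c + 1]]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            for i, v in enumerate(payload):
                bufs[dst][bounds[c] + i] += v

    # Phase 2, allgather: circulate the completed chunks around the ring.
    for step in range(n - 1):
        msgs = []
        for r in range(n):
            c = (r + 1 - step) % n
            msgs.append((c, bufs[r][bounds[c]:bounds[c + 1]]))
        for r in range(n):
            dst = (r + 1) % n
            c, payload = msgs[r]
            bufs[dst][bounds[c]:bounds[c + 1]] = payload

    return [[v / n for v in b] for b in bufs]


worker_grads = [
    [1.0, 2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0, 8.0],
]
averaged = ring_allreduce_mean(worker_grads)
```

In each of the 2(n - 1) steps a worker sends only one chunk to its ring neighbour, so no single node has to receive every worker's full gradient, which is the bottleneck a central parameter server would face.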
  • 17. Spark-MPI-Horovod
      The Horovod MPI-based training framework replaces the TensorFlow parameter servers with the ring-allreduce approach for averaging gradients among TensorFlow workers. For users, the corresponding integration consists of two primary steps, as illustrated by the script: (1) initializing Horovod with hvd.init() and (2) wrapping the TensorFlow worker’s optimizer with hvd.DistributedOptimizer().
      Spark-MPI pipelines make it possible to run the Horovod training on Spark workers with Map operations. To establish MPI communication among the Spark workers, the Map operation (e.g. train()) only needs to define PMI-related environment variables (such as PMIX_RANK and a port number).
      Script annotations:
      ▪ Initialize the PMI environment variables
      ▪ Initialize Horovod and MPI
      ▪ Extract the MNIST dataset
      ▪ Build the DL model
      ▪ Create the TF optimizer
      ▪ Wrap TF with Horovod
      ▪ …
      ▪ Run the Horovod MPI-based training
      ▪ Run the Horovod training on Spark workers
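The first step of the Map operation, defining the PMI-related environment variables, might look roughly like the sketch below. Several parts are assumptions: only `PMIX_RANK` is named on the slide, so `SPARKMPI_DEMO_PORT` is a hypothetical stand-in for the unnamed port variable; the port value 5555 is arbitrary; and Python's built-in `map` over a range of ranks stands in for a Spark RDD map. The Horovod calls themselves are left as a comment so the example runs without Spark, TensorFlow, or Horovod installed.

```python
import os

# Hypothetical variable name: the slide mentions "a port number" but does
# not say which environment variable carries it.
PORT_VAR = "SPARKMPI_DEMO_PORT"


def pmi_environment(rank, port):
    """Build the PMI-related settings for one worker (values are strings)."""
    return {"PMIX_RANK": str(rank), PORT_VAR: str(port)}


def train(rank, port=5555, environ=None):
    """Stand-in for the Map function executed on each Spark worker."""
    env = os.environ if environ is None else environ
    env.update(pmi_environment(rank, port))
    # In the real pipeline, hvd.init(), the model definition, the
    # hvd.DistributedOptimizer() wrapper, and the training loop follow here.
    return env["PMIX_RANK"]


# Plain map over ranks as a stand-in for an RDD map over partitions.
# Each call gets its own dict so the simulated workers do not clobber
# one another's settings inside this single process.
ranks = list(map(lambda r: train(r, environ={}), range(4)))
```

On a real cluster each worker runs in its own process, so `train()` would write to `os.environ` directly (the default when `environ` is omitted) before initializing Horovod.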
  • 19. Deep Reinforcement Learning
      Figure 1: Gorila* (General Reinforcement Learning Architecture) [labels: Environment; Actor; Agent: argmax_a Q(s, a; θ); Replay Memory: (s, a, r, s’); DQN Learner; Parameter Server; (r, s’)]
      System Requirements**:
      ▪ Low latency
      ▪ High throughput
      ▪ Dynamic task creation
      ▪ Heterogeneous tasks
      ▪ Arbitrary dataflow dependencies
      ▪ Transparent fault tolerance
      ▪ Debuggability and profiling
      *A. Nair et al. Massively Parallel Methods for Deep Reinforcement Learning, ICML, 2015
      **R. Nishihara et al. Real-Time Machine Learning: The Missing Pieces, arXiv:1703.03924, 2017
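Two pieces of the Gorila diagram translate directly into small data structures: the replay memory of (s, a, r, s’) transitions and the greedy action rule argmax_a Q(s, a; θ). The sketch below is illustrative only. The class and function names are hypothetical, the Q-values are passed in as a plain list rather than computed by a network, and nothing here is distributed.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of (s, a, r, s_next) transitions."""

    def __init__(self, capacity):
        # A bounded deque silently evicts the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        """Draw a uniform random minibatch without replacement."""
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def greedy_action(q_values):
    """Return argmax_a Q(s, a) given the Q-value of each action in state s."""
    return max(range(len(q_values)), key=q_values.__getitem__)


memory = ReplayMemory(capacity=3)
for t in range(4):
    memory.push(s=t, a=t % 2, r=1.0, s_next=t + 1)

batch = memory.sample(2, rng=random.Random(0))
action = greedy_action([0.1, 0.7, 0.2])
```

Sampling uniformly from the buffer, rather than training on transitions in the order they arrive, decorrelates consecutive experiences before they reach the learner.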
  • 20. (Some of the) RL Applications*
      ▪ Atari Games (1)
      ▪ AlphaGo (2)
      ▪ Robotics
      ▪ Self-driving vehicles
      ▪ Autonomous UAVs
      ▪ …
      “Pterodactylus antiquus, the first pterosaur species to be named and identified as a flying reptile … 150.8–148.5 million years ago” (Wikipedia)
      (1) V. Mnih et al. Playing Atari with Deep Reinforcement Learning, NIPS, 2013
      (2) D. Silver et al. Mastering the game of Go with deep neural networks and tree search, Nature, 2016
  • 21. Summary
      ▪ Emerging AI projects represent a paradigm shift from data processing pipelines towards the fifth paradigm of cognitive knowledge-centric applications.
      ▪ Knowledge is a multifaceted substance distributed among heterogeneous information networks and associated processing platforms. The structure and relationship between different components is dynamic, continuously shaped and consolidated by machine learning processes.
      ▪ The new generation of AI composite applications requires the integration of Big Data and HPC technologies. For example, MPI was originally introduced within the computational paradigm ecosystem for developing HPC scientific applications, but it has recently been applied successfully to extend the scale of deep learning applications.
      ▪ Spark-MPI addresses this strategic direction by extending the Spark platform with MPI-based HPC applications using the Process Management Interface (PMI).