SlideShare a Scribd company logo
1 of 37
Download to read offline
Towards Unification of HPC and Big
Data Paradigms
Universidad Complutense de Madrid
Conferencias de Postgrado
Computer Science and Engineering Department
University Carlos III of Madrid
Jesús Carretero
jcarrete@inf.uc3m.es
2
q Inference Spiral of System Science
q As models become more complex and new data bring in more information, we
require ever increasing computational resources
University Carlos III of Madrid
Science research is changing
3
Who is generating Big Data
Social media and networks
(all of us are generating data)
Scientific instruments
(collecting all sorts of data)
Mobile devices
(tracking all objects all the time)
Sensor technology and
networks
(measuring all kinds of data)
Companies and e-commerce
(Collecting and warehousing data)
University Carlos III of Madrid
4
q Simulation has become the way to research and develop new scientific
and engineering solutions.
q Used nowadays in leading science domains like aerospace industry,
astrophysics, etc.
q Challenges related to the complexity, scalability and data production of
the simulators arise.
q Impact on the relaying IT infrastructure.
University Carlos III of Madrid
Parallel applications require more data everyday
…
5
IoT: the paradigmatic challenge
University Carlos III of Madrid
q The progress and innovation is no longer hindered by the ability to collect data
q But, by the ability to manage, analyze, summarize, visualize, and discover
knowledge from the collected data in a timely manner and in a scalable fashion
6
q Cross fertilization among High Performance Computing (HPC),
large scale distributed systems, and big data management is
needed.
q Mechanisms should be valid for HPC, HTC and workflows …
q Data will play an increasingly substantial role in the near future
q Huge amounts of data produced by real-world devices, applications and
systems (checkpoint, monitoring, …)
Cross fertilization needed
University Carlos III of Madrid
7
q HPC Simulations and data
q Challenges related to the complexity, scalability and data production of the
simulators arise.
qHigh-Performance data analytics (HPDA)
qMore input data (ingestion)
qMore output data for integration/analysis
qReal time, near-real time requirements
University Carlos III of Madrid
Areas of convergence
8
q Systems are expensive and not integrating misses opportunities
q Leveraging investments and purchasing power
q Integration of Computation and Observation cycles implicitly requires
convergence
q Expanded cross disciplinary teams of researchers are needed to explore
the most challenging problems for society
q Data Consolidation trends span Big Data and HPC
q Categorization of Data
q Structured, Semi-structured and Unstructured Data
q Computer Generated and Observed Data
University Carlos III of Madrid
HPC-BD convergence motivation
9
Context
HPC BIG DATA
q Focus: large volumes of
loosely-coupled tasks.
q Architecture: co-located
computation and data,
elasticity is required.
University Carlos III of Madrid
q Focus: CPU-intensive
tightly-coupled applications
q Architecture: compute and
storage are decoupled, high-
speed interconnections.
HPC-Big Data convergence is a must
q Data-intensive scientic computing
q High-performance data analytics
q Convergence at the infrastructure layer
q virtualisation for HPC, deeper storage hierarchy, …
10
HPC and Big Data models
q HPC requires
Computing-Centric
Models (CCM)
q Big Data requires Data-
Centric Models (DCM)
University Carlos III of Madrid
11
Platforms & paradigms
Physical or virtual
q Clusters and
supercomputers
qHPC and supercomputing
q Clouds
qVirtualized resources
qHigher-level model
University Carlos III of Madrid
General or specific
q Processing paradigms
q Open MP and MPI
qCollective model (PGAs,…)
q MapReduce model
q Iterative MapReduce model
q DAG model
q Graph model
12 University Carlos III of Madrid
Data analytics and computing ecosystem compared
Daniel A. Reed And Jack Dongarra. Exascale Computing and Big Data.Communications Of The Acm. 58(1). July 2015. 7
13
q HPC system
University Carlos III of Madrid
Non-Convergent system architectures
q Big Data Platforms
Compute
farm
Storage
farm
High speed
network Network
Local disk
processor
Virtualized resourcesPhysical resources
14
q Traditional approach: open loop
University Carlos III of Madrid
Integration of computation and observation
Simulationdata
On-line
analytics
results
q Desired approach: closed loop
results
visualization
Simulationdata
Off-line
analytics
results
15
q Integrate the platform layer and data abstractions for both HPC and Big
Data platforms
q We can use Mpi-based MapReduce, but we loose all BD existing facilities.
q Solution: Connection of MPI applications and Spark.
q Avoid data copies between simulation and analysis every iteration.
q HPC and BigData use different file systems
q Copying data will lead to poor performance and huge storage space
q Solution: Scalable I/O system architecture.
q Have data-aware allocation of tasks in HPC.
q Schedulers are CPU oriented
q Solution: connecting scheduler with data allocation.
University Carlos III of Madrid
But we need to …
16
q HPC and BD have separate computing environment heritages.
q Data: R, Python, Hadoop, MAHOUT, MLLIB, SPARK
q HPC: Fortran, C, C++, BLAS, LAPACK, HSL, PETSc, Trilinos.
q Determine capabilities, requirements (application, system, user),
opportunities and gaps for:
q Leveraging HPC library capabilities in BD (e.g., scalable solvers).
q Providing algorithms in native BD environments.
q Providing HPC apps, libraries as appliances (containers aaS).
University Carlos III of Madrid
Convergence in programming environments?
17
MapReduce is the leading paradigm
q A simple programming model
q Functional model
q A combination of the Map and Reduce models with an associated implementation
q For large-scale data processing
q Exploits large set of commodity computers
q Executes process in distributed manner
q Offers high availability
q Used for processing and generating large data sets
University Carlos III of Madrid
18
Data-driven distribution
q In a MapReduce cluster, data is distributed to all the nodes of the cluster as it is
being loaded in.
q An underlying distributed file systems (HDFS) splits large data files into chunks
which are managed by different nodes in the cluster
q Even though the file chunks are distributed across several machines, they form a
single namespace (key, value)
q Scale: Large number of commodity hardware disks: say, 1000 disks 1TB each
Input data: A large file
Node 1
Chunk of input data
Node 2
Chunk of input data
Node 3
Chunk of input data
University Carlos III of Madrid
19
q Benchmark for comparing: Jim Gray’s challenge on data-intensive
computing. Ex: “Sort”
q Google uses it (we think) for wordcount, adwords, pagerank, indexing
data.
q Simple algorithms such as grep, text-indexing, reverse indexing
q Bayesian classification: data mining domain
q Facebook uses it for various operations: demographics
q Financial services use it for analytics
q Astronomy: Gaussian analysis for locating extra-terrestrial objects.
q Expected to play a critical role in semantic web and web3.0
Classes of problems “mapreducable”
University Carlos III of Madrid
20
q Find the way to divide the original simulation
q into smaller independent simulations (BSP model)
q Analyse the original simulation domain in order to find an independent
variable Tx that can act as index for the partitioned input data.
q Independent time-domain steps
q Spatial divisions
q Range of simulation parameters
The goal is to run the same simulation kernel but on fragments of the full
partitioned data set
University Carlos III of Madrid
Data-centric adaptation
21
q Data adaptation phase: first Map-Reduce task
q Reads the input files and indexes all the necessary parameters by Tx
q Reducers provide intermediate <key, value> output for next step
q The original data is partitioned
q Subsequent simulations can run autonomously for each (Tx; parameters) entry.
q Simulation phase: second Map-Reduce task
q Runs the simulation kernel for each value of the independent variable
q With the necessary data that was mapped to them in the previous stage
q Plus the required simulation parameters that are common for every partition
q Reducers are able to gather all the output and provide final results as the
original application.
University Carlos III of Madrid
Methodology: two phase approach
22 University Carlos III of Madrid
Data-driven architectural model
"Efficient design assessment in the railway electric infrastructure domain using cloud computing", S. Caíno-Lores, A. García, F.
García-Carballeira, J. Carretero, Integrated Computer-Aided Engineng, vol. 24, no. 1, pp. 57-72, December, 2016.
23 University Carlos III of Madrid
Hydrogeology simulator adaptation
q The ensemble of realizations constitute the parallelizable domain (i.e. key).
q Columns of the model are distributed per realization.
24 University Carlos III of Madrid
Problem: Scalability
Cluster
EC2
25
q MR-MPI
q Open-source implementation of MapReduce written for distributed-memory
parallel machines on top of standard MPI message passing.
q C++ and C interfaces and a Python wrapper
q MIMIR
q Mimir can handle 16 X larger dataset in-memory compared with MR-MPI
q Mimir scale to 16,384 processes
q Mimir is a open-source https://github.com/TauferLab/Mimir.git
University Carlos III of Madrid
MapReduce framework for MPI
} http://mapreduce.sandia.gov/
} [1] T. Gao, Y. Guo, B. Zhang, P. Cicotti, Y. Lu, P. Balaji, and M. Taufer. Mimir: Memory-Efficient and Scalable
MapReduce for Large Supercomputing Systems. In Proceedings of the IPDPS, 2017.
26
q Single-node execution (24 processes, 128G memory)
qBenchmarks: WC with Wikipedia dataset
qSettings: MR-MPI (64M page and 512M page); Mimir (64M page)
Mimir vs. MR-MPI: WordCount on Comet
Mimir can
handle
4X larger
dataset
64X
4X
University Carlos III of Madrid
27
q The answer is no…
q Programming environments do not match.
q Uuppps! The users are not very happy.
q What can we do?
q Program in Spark and transparently jump to MPI world
q How: using the RDD anstraction of Spark and the topologies of MPI
q Is there a solution? Not yet, working on it: ARCOS + Argonne Labs
q A fist proof of concept is running. Happy ? Not yet, but …
University Carlos III of Madrid
But, can I run my Spark program on it…?
28
q To layer the Spark application model on top of a
MPI-based library
q Goals:
q Minimise the user knowledge of the underlying data model
q Expose explicit interoperability
q Preserve the nature of the Spark interface
q Support multiple data types
q Platform:
qMinimise data transfers
qIntegrate with the framework without changes
University Carlos III of Madrid
Proposal: connecting MPI and Spark
29 University Carlos III of Madrid
Spark on MPI
30
q Roles of storage in HPC systems
q Data collection I/O
q Analysis I/O. Logging.
q Defensive I/O. Checkpointing
q Big Data requires:
q Near-storage
q Replication and
q Elasticity
Convergence in storage system is also needed
University Carlos III of Madrid
31
Scalable I/O system architecture
.... ….
Compute nodes
I/O nodes
Storage nodes
Back-end storage
NVRM
NVRMNVRMNVRM
• Hybrid RAM/NVRAM local storage
• Active participation in data and metadata
management
• Use many-core nodes computational power
and fast inteconnection network
• Scalability with system for some workloads
(burst scheduling)
• Hybrid NVRAM/HD
• Burst buffers for absorbing peaks of load
• Intermediate storage for (small) temporal
loads
• Parallel/distributed file system (e.g.:
GPFS, Lustre, PVFS)
• Global system image
• Storage object devices
• High performance storage access
LSS
GSS
University Carlos III of Madrid
32
q Dynamic deployment of I/O tools needed
q Application guided, less metadata
q Mostly memory based (but finally persistent)
q Static hierarchical FS will not do it (alone)
q Need to enhance data locality with load-balance in application
execution
q Computing and data intensive computing on same systems (HPC, HTC and
workflows)
q Process in-site, don´t store temporal data (to GSS)
Ad-hoc (in-memory) local storage
University Carlos III of Madrid
33
LSS proposal: Hercules
University Carlos III of Madrid
34
q Now:
q HDFS and algorithmic hashing placemente in Big Data
q Optimization of load balance in HPC
q We need to be data locality-aware
q Place RDD in node memory or local storage.
q Execute MPI/analytic tasks in the node containing the data
q Problem: to know where data is to keep load balance:
q Data-aware placement
q Connect scheduler with Hercules to ask for data allocation
Data-aware scheduling
University Carlos III of Madrid
35
q Vertical coordination
q Map application models on storage models
q Coordinate multiple level buffering/caching for latency hiding
q Vertical data flow control: compute nodes <-> I/O nodes <-> file system
<-> storage
q Multiple-level write-back / write-though, Multiple-level prefetching
q Horizontal coordination
q Collective I/O on compute nodes
q I/O storage, aggregation, and operations on the I/O nodes
University Carlos III of Madrid
Problem: coordination LSS - GSS
F. Isaila et all. Design and evaluation of multiple level data staging for Blue Gene systems. In IEEE Trans. of Parallel and
Distributed Systems, 2011.
36
q Upcoming HPC and Big Data applications need hybrid infrastructures
and execution platforms.
q Storage platforms convergence among HPC and BD is a must
q Integrate with memory-centric ad-hoc storage systems.
q Create mechanisms to induce data locality in HPC-oriented paradigms.
q Co-location of data and computation to improve performance
q HPC scheduler must be data locality-aware
q Data allocation must be CPU-aware
q We need efficient collective communication mechanisms in Big Data
platforms
q Joining MPI ang Spark models through RDDs
University Carlos III of Madrid
Conclusions
Towards Unification of HPC and Big
Data Paradigms
Universidad Complutense de Madrid
Computer Science and Engineering Department
University Carlos III of Madrid
Jesús Carretero
jcarrete@inf.uc3m.es

More Related Content

Similar to HPC and Big Data Paradigms Convergence

Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22marpierc
 
Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Stavros Kontopoulos
 
Slide 1
Slide 1Slide 1
Slide 1butest
 
Slide 1
Slide 1Slide 1
Slide 1butest
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataPaco Nathan
 
2010 Future of Advanced Computing
2010 Future of Advanced Computing2010 Future of Advanced Computing
2010 Future of Advanced ComputingBob Marcus
 
A Mapping-Based Framework for the Integration of Machine Data and Information...
A Mapping-Based Framework for the Integration of Machine Data and Information...A Mapping-Based Framework for the Integration of Machine Data and Information...
A Mapping-Based Framework for the Integration of Machine Data and Information...heigoo
 
Practitioner's perspective on High Performance Computing services for innovat...
Practitioner's perspective on High Performance Computing services for innovat...Practitioner's perspective on High Performance Computing services for innovat...
Practitioner's perspective on High Performance Computing services for innovat...Huawei Enterprise Hong Kong
 
OpenACC and Hackathons Monthly Highlights
OpenACC and Hackathons Monthly HighlightsOpenACC and Hackathons Monthly Highlights
OpenACC and Hackathons Monthly HighlightsOpenACC
 
MAD skills for analysis and big data Machine Learning
MAD skills for analysis and big data Machine LearningMAD skills for analysis and big data Machine Learning
MAD skills for analysis and big data Machine LearningGianvito Siciliano
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Enrico Daga
 
Quantum Computing in Cloud
Quantum Computing in CloudQuantum Computing in Cloud
Quantum Computing in CloudAnil Loutombam
 
CENTRE FOR DATA CENTER WITH DIAGRAMS.ppt
CENTRE FOR DATA CENTER WITH DIAGRAMS.pptCENTRE FOR DATA CENTER WITH DIAGRAMS.ppt
CENTRE FOR DATA CENTER WITH DIAGRAMS.pptdhanasekarscse
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCFacultad de Informática UCM
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitGanesan Narayanasamy
 
Cluster Tutorial
Cluster TutorialCluster Tutorial
Cluster Tutorialcybercbm
 
Intel Faster Risk Oct08 - Vassil Alexandrov
Intel Faster Risk Oct08 - Vassil AlexandrovIntel Faster Risk Oct08 - Vassil Alexandrov
Intel Faster Risk Oct08 - Vassil Alexandrovmikeohara
 

Similar to HPC and Big Data Paradigms Convergence (20)

Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22Cyberinfrastructure and Applications Overview: Howard University June22
Cyberinfrastructure and Applications Overview: Howard University June22
 
Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013Big dataanalyticsbeyondhadoop public_20_june_2013
Big dataanalyticsbeyondhadoop public_20_june_2013
 
Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016Trivento summercamp fast data 9/9/2016
Trivento summercamp fast data 9/9/2016
 
Slide 1
Slide 1Slide 1
Slide 1
 
Slide 1
Slide 1Slide 1
Slide 1
 
Apache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big DataApache Spark and the Emerging Technology Landscape for Big Data
Apache Spark and the Emerging Technology Landscape for Big Data
 
2010 Future of Advanced Computing
2010 Future of Advanced Computing2010 Future of Advanced Computing
2010 Future of Advanced Computing
 
A Mapping-Based Framework for the Integration of Machine Data and Information...
A Mapping-Based Framework for the Integration of Machine Data and Information...A Mapping-Based Framework for the Integration of Machine Data and Information...
A Mapping-Based Framework for the Integration of Machine Data and Information...
 
Practitioner's perspective on High Performance Computing services for innovat...
Practitioner's perspective on High Performance Computing services for innovat...Practitioner's perspective on High Performance Computing services for innovat...
Practitioner's perspective on High Performance Computing services for innovat...
 
OpenACC and Hackathons Monthly Highlights
OpenACC and Hackathons Monthly HighlightsOpenACC and Hackathons Monthly Highlights
OpenACC and Hackathons Monthly Highlights
 
MAD skills for analysis and big data Machine Learning
MAD skills for analysis and big data Machine LearningMAD skills for analysis and big data Machine Learning
MAD skills for analysis and big data Machine Learning
 
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
Towards a Smart (City) Data Science. A case-based retrospective on policies, ...
 
Quantum Computing in Cloud
Quantum Computing in CloudQuantum Computing in Cloud
Quantum Computing in Cloud
 
Elective II.pdf
Elective II.pdfElective II.pdf
Elective II.pdf
 
CENTRE FOR DATA CENTER WITH DIAGRAMS.ppt
CENTRE FOR DATA CENTER WITH DIAGRAMS.pptCENTRE FOR DATA CENTER WITH DIAGRAMS.ppt
CENTRE FOR DATA CENTER WITH DIAGRAMS.ppt
 
Automatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPCAutomatic generation of hardware memory architectures for HPC
Automatic generation of hardware memory architectures for HPC
 
AI Super computer update
AI Super computer update AI Super computer update
AI Super computer update
 
Scientific Application Development and Early results on Summit
Scientific Application Development and Early results on SummitScientific Application Development and Early results on Summit
Scientific Application Development and Early results on Summit
 
Cluster Tutorial
Cluster TutorialCluster Tutorial
Cluster Tutorial
 
Intel Faster Risk Oct08 - Vassil Alexandrov
Intel Faster Risk Oct08 - Vassil AlexandrovIntel Faster Risk Oct08 - Vassil Alexandrov
Intel Faster Risk Oct08 - Vassil Alexandrov
 

More from Facultad de Informática UCM

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?Facultad de Informática UCM
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...Facultad de Informática UCM
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersFacultad de Informática UCM
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmFacultad de Informática UCM
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingFacultad de Informática UCM
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroFacultad de Informática UCM
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...Facultad de Informática UCM
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Facultad de Informática UCM
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFacultad de Informática UCM
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoFacultad de Informática UCM
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Facultad de Informática UCM
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Facultad de Informática UCM
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Facultad de Informática UCM
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windFacultad de Informática UCM
 
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Facultad de Informática UCM
 

More from Facultad de Informática UCM (20)

¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?¿Por qué debemos seguir trabajando en álgebra lineal?
¿Por qué debemos seguir trabajando en álgebra lineal?
 
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
TECNOPOLÍTICA Y ACTIVISMO DE DATOS: EL MAPEO COMO FORMA DE RESILIENCIA ANTE L...
 
DRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation ComputersDRAC: Designing RISC-V-based Accelerators for next generation Computers
DRAC: Designing RISC-V-based Accelerators for next generation Computers
 
uElectronics ongoing activities at ESA
uElectronics ongoing activities at ESAuElectronics ongoing activities at ESA
uElectronics ongoing activities at ESA
 
Tendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura ArmTendencias en el diseño de procesadores con arquitectura Arm
Tendencias en el diseño de procesadores con arquitectura Arm
 
Formalizing Mathematics in Lean
Formalizing Mathematics in LeanFormalizing Mathematics in Lean
Formalizing Mathematics in Lean
 
Introduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented ComputingIntroduction to Quantum Computing and Quantum Service Oriented Computing
Introduction to Quantum Computing and Quantum Service Oriented Computing
 
Computer Design Concepts for Machine Learning
Computer Design Concepts for Machine LearningComputer Design Concepts for Machine Learning
Computer Design Concepts for Machine Learning
 
Inteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuroInteligencia Artificial en la atención sanitaria del futuro
Inteligencia Artificial en la atención sanitaria del futuro
 
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 Design Automation Approaches for Real-Time Edge Computing for Science Applic... Design Automation Approaches for Real-Time Edge Computing for Science Applic...
Design Automation Approaches for Real-Time Edge Computing for Science Applic...
 
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
Estrategias de navegación para robótica móvil de campo: caso de estudio proye...
 
Fault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error CorrectionFault-tolerance Quantum computation and Quantum Error Correction
Fault-tolerance Quantum computation and Quantum Error Correction
 
Cómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intentoCómo construir un chatbot inteligente sin morir en el intento
Cómo construir un chatbot inteligente sin morir en el intento
 
Type and proof structures for concurrency
Type and proof structures for concurrencyType and proof structures for concurrency
Type and proof structures for concurrency
 
Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...Hardware/software security contracts: Principled foundations for building sec...
Hardware/software security contracts: Principled foundations for building sec...
 
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
Jose carlossancho slidesLa seguridad en el desarrollo de software implementad...
 
Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?Do you trust your artificial intelligence system?
Do you trust your artificial intelligence system?
 
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.Redes neuronales y reinforcement learning. Aplicación en energía eólica.
Redes neuronales y reinforcement learning. Aplicación en energía eólica.
 
Challenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore windChallenges and Opportunities for AI and Data analytics in Offshore wind
Challenges and Opportunities for AI and Data analytics in Offshore wind
 
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
Evolution and Trends in Edge AI Systems and Architectures for the Internet of...
 

Recently uploaded

High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...ranjana rawat
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxhumanexperienceaaa
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...ranjana rawat
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingrknatarajan
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Call Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 

Recently uploaded (20)

High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
(SHREYA) Chakan Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Esc...
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptxthe ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
the ladakh protest in leh ladakh 2024 sonam wangchuk.pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
(TARA) Talegaon Dabhade Call Girls Just Call 7001035870 [ Cash on Delivery ] ...
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and workingUNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...Top Rated  Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 

HPC and Big Data Paradigms Convergence

  • 1. Towards Unification of HPC and Big Data Paradigms Universidad Complutense de Madrid Conferencias de Postgrado Computer Science and Engineering Department University Carlos III of Madrid Jesús Carretero jcarrete@inf.uc3m.es
  • 2. 2 q Inference Spiral of System Science q As models become more complex and new data bring in more information, we require ever increasing computational resources University Carlos III of Madrid Science research is changing
  • 3. 3 Who is generating Big Data Social media and networks (all of us are generating data) Scientific instruments (collecting all sorts of data) Mobile devices (tracking all objects all the time) Sensor technology and networks (measuring all kinds of data) Companies and e-commerce (Collecting and warehousing data) University Carlos III of Madrid
  • 4. 4 q Simulation has become the way to research and develop new scientific and engineering solutions. q Used nowadays in leading science domains like aerospace industry, astrophysics, etc. q Challenges related to the complexity, scalability and data production of the simulators arise. q Impact on the relaying IT infrastructure. University Carlos III of Madrid Parallel applications require more data everyday …
  • 5. 5 IoT: the paradigmatic challenge University Carlos III of Madrid q The progress and innovation is no longer hindered by the ability to collect data q But, by the ability to manage, analyze, summarize, visualize, and discover knowledge from the collected data in a timely manner and in a scalable fashion
  • 6. 6 q Cross fertilization among High Performance Computing (HPC), large scale distributed systems, and big data management is needed. q Mechanisms should be valid for HPC, HTC and workflows … q Data will play an increasingly substantial role in the near future q Huge amounts of data produced by real-world devices, applications and systems (checkpoint, monitoring, …) Cross fertilization needed University Carlos III of Madrid
  • 7. 7 q HPC Simulations and data q Challenges related to the complexity, scalability and data production of the simulators arise. qHigh-Performance data analytics (HPDA) qMore input data (ingestion) qMore output data for integration/analysis qReal time, near-real time requirements University Carlos III of Madrid Areas of convergence
  • 8. 8 q Systems are expensive and not integrating misses opportunities q Leveraging investments and purchasing power q Integration of Computation and Observation cycles implicitly requires convergence q Expanded cross disciplinary teams of researchers are needed to explore the most challenging problems for society q Data Consolidation trends span Big Data and HPC q Categorization of Data q Structured, Semi-structured and Unstructured Data q Computer Generated and Observed Data University Carlos III of Madrid HPC-BD convergence motivation
  • 9. 9 Context HPC BIG DATA q Focus: large volumes of loosely-coupled tasks. q Architecture: co-located computation and data, elasticity is required. University Carlos III of Madrid q Focus: CPU-intensive tightly-coupled applications q Architecture: compute and storage are decoupled, high- speed interconnections. HPC-Big Data convergence is a must q Data-intensive scientic computing q High-performance data analytics q Convergence at the infrastructure layer q virtualisation for HPC, deeper storage hierarchy, …
  • 10. 10 HPC and Big Data models q HPC requires Computing-Centric Models (CCM) q Big Data requires Data- Centric Models (DCM) University Carlos III of Madrid
  • 11. 11 Platforms & paradigms Physical or virtual q Clusters and supercomputers qHPC and supercomputing q Clouds qVirtualized resources qHigher-level model University Carlos III of Madrid General or specific q Processing paradigms q Open MP and MPI qCollective model (PGAs,…) q MapReduce model q Iterative MapReduce model q DAG model q Graph model
  • 12. 12 University Carlos III of Madrid Data analytics and computing ecosystem compared Daniel A. Reed And Jack Dongarra. Exascale Computing and Big Data.Communications Of The Acm. 58(1). July 2015. 7
  • 13. 13 q HPC system University Carlos III of Madrid Non-Convergent system architectures q Big Data Platforms Compute farm Storage farm High speed network Network Local disk processor Virtualized resourcesPhysical resources
  • 14. 14 q Traditional approach: open loop University Carlos III of Madrid Integration of computation and observation Simulationdata On-line analytics results q Desired approach: closed loop results visualization Simulationdata Off-line analytics results
  • 15. 15 q Integrate the platform layer and data abstractions for both HPC and Big Data platforms q We can use Mpi-based MapReduce, but we loose all BD existing facilities. q Solution: Connection of MPI applications and Spark. q Avoid data copies between simulation and analysis every iteration. q HPC and BigData use different file systems q Copying data will lead to poor performance and huge storage space q Solution: Scalable I/O system architecture. q Have data-aware allocation of tasks in HPC. q Schedulers are CPU oriented q Solution: connecting scheduler with data allocation. University Carlos III of Madrid But we need to …
  • 16. 16 q HPC and BD have separate computing environment heritages. q Data: R, Python, Hadoop, MAHOUT, MLLIB, SPARK q HPC: Fortran, C, C++, BLAS, LAPACK, HSL, PETSc, Trilinos. q Determine capabilities, requirements (application, system, user), opportunities and gaps for: q Leveraging HPC library capabilities in BD (e.g., scalable solvers). q Providing algorithms in native BD environments. q Providing HPC apps, libraries as appliances (containers aaS). University Carlos III of Madrid Convergence in programming environments?
  • 17. 17 MapReduce is the leading paradigm q A simple programming model q Functional model q A combination of the Map and Reduce models with an associated implementation q For large-scale data processing q Exploits large set of commodity computers q Executes process in distributed manner q Offers high availability q Used for processing and generating large data sets University Carlos III of Madrid
  • 18. 18 Data-driven distribution q In a MapReduce cluster, data is distributed to all the nodes of the cluster as it is being loaded in. q An underlying distributed file systems (HDFS) splits large data files into chunks which are managed by different nodes in the cluster q Even though the file chunks are distributed across several machines, they form a single namespace (key, value) q Scale: Large number of commodity hardware disks: say, 1000 disks 1TB each Input data: A large file Node 1 Chunk of input data Node 2 Chunk of input data Node 3 Chunk of input data University Carlos III of Madrid
  • 19. 19 q Benchmark for comparing: Jim Gray’s challenge on data-intensive computing. Ex: “Sort” q Google uses it (we think) for wordcount, adwords, pagerank, indexing data. q Simple algorithms such as grep, text-indexing, reverse indexing q Bayesian classification: data mining domain q Facebook uses it for various operations: demographics q Financial services use it for analytics q Astronomy: Gaussian analysis for locating extra-terrestrial objects. q Expected to play a critical role in semantic web and web3.0 Classes of problems “mapreducable” University Carlos III of Madrid
  • 20. 20 q Find the way to divide the original simulation q into smaller independent simulations (BSP model) q Analyse the original simulation domain in order to find an independent variable Tx that can act as index for the partitioned input data. q Independent time-domain steps q Spatial divisions q Range of simulation parameters The goal is to run the same simulation kernel but on fragments of the full partitioned data set University Carlos III of Madrid Data-centric adaptation
  • 21. 21 q Data adaptation phase: first Map-Reduce task q Reads the input files and indexes all the necessary parameters by Tx q Reducers provide intermediate <key, value> output for next step q The original data is partitioned q Subsequent simulations can run autonomously for each (Tx; parameters) entry. q Simulation phase: second Map-Reduce task q Runs the simulation kernel for each value of the independent variable q With the necessary data that was mapped to them in the previous stage q Plus the required simulation parameters that are common for every partition q Reducers are able to gather all the output and provide final results as the original application. University Carlos III of Madrid Methodology: two phase approach
  • 22. 22 University Carlos III of Madrid Data-driven architectural model "Efficient design assessment in the railway electric infrastructure domain using cloud computing", S. Caíno-Lores, A. García, F. García-Carballeira, J. Carretero, Integrated Computer-Aided Engineng, vol. 24, no. 1, pp. 57-72, December, 2016.
  • 23. 23 University Carlos III of Madrid Hydrogeology simulator adaptation q The ensemble of realizations constitute the parallelizable domain (i.e. key). q Columns of the model are distributed per realization.
  • 24. 24 University Carlos III of Madrid Problem: Scalability Cluster EC2
  • 25. 25 q MR-MPI q Open-source implementation of MapReduce written for distributed-memory parallel machines on top of standard MPI message passing. q C++ and C interfaces and a Python wrapper q MIMIR q Mimir can handle 16 X larger dataset in-memory compared with MR-MPI q Mimir scale to 16,384 processes q Mimir is a open-source https://github.com/TauferLab/Mimir.git University Carlos III of Madrid MapReduce framework for MPI } http://mapreduce.sandia.gov/ } [1] T. Gao, Y. Guo, B. Zhang, P. Cicotti, Y. Lu, P. Balaji, and M. Taufer. Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems. In Proceedings of the IPDPS, 2017.
  • 26. 26 q Single-node execution (24 processes, 128G memory) qBenchmarks: WC with Wikipedia dataset qSettings: MR-MPI (64M page and 512M page); Mimir (64M page) Mimir vs. MR-MPI: WordCount on Comet Mimir can handle 4X larger dataset 64X 4X University Carlos III of Madrid
  • 27. 27 q The answer is no… q Programming environments do not match. q Uuppps! The users are not very happy. q What can we do? q Program in Spark and transparently jump to MPI world q How: using the RDD anstraction of Spark and the topologies of MPI q Is there a solution? Not yet, working on it: ARCOS + Argonne Labs q A fist proof of concept is running. Happy ? Not yet, but … University Carlos III of Madrid But, can I run my Spark program on it…?
  • 28. 28 q To layer the Spark application model on top of a MPI-based library q Goals: q Minimise the user knowledge of the underlying data model q Expose explicit interoperability q Preserve the nature of the Spark interface q Support multiple data types q Platform: qMinimise data transfers qIntegrate with the framework without changes University Carlos III of Madrid Proposal: connecting MPI and Spark
  • 29. 29 University Carlos III of Madrid Spark on MPI
  • 30. 30 q Roles of storage in HPC systems q Data collection I/O q Analysis I/O. Logging. q Defensive I/O. Checkpointing q Big Data requires: q Near-storage q Replication and q Elasticity Convergence in storage system is also needed University Carlos III of Madrid
  • 31. 31 Scalable I/O system architecture .... …. Compute nodes I/O nodes Storage nodes Back-end storage NVRM NVRMNVRMNVRM • Hybrid RAM/NVRAM local storage • Active participation in data and metadata management • Use many-core nodes computational power and fast inteconnection network • Scalability with system for some workloads (burst scheduling) • Hybrid NVRAM/HD • Burst buffers for absorbing peaks of load • Intermediate storage for (small) temporal loads • Parallel/distributed file system (e.g.: GPFS, Lustre, PVFS) • Global system image • Storage object devices • High performance storage access LSS GSS University Carlos III of Madrid
  • 32. 32 q Dynamic deployment of I/O tools needed q Application guided, less metadata q Mostly memory based (but finally persistent) q Static hierarchical FS will not do it (alone) q Need to enhance data locality with load-balance in application execution q Computing and data intensive computing on same systems (HPC, HTC and workflows) q Process in-site, don´t store temporal data (to GSS) Ad-hoc (in-memory) local storage University Carlos III of Madrid
  • 33. 33 LSS proposal: Hercules University Carlos III of Madrid
  • 34. 34 q Now: q HDFS and algorithmic hashing placemente in Big Data q Optimization of load balance in HPC q We need to be data locality-aware q Place RDD in node memory or local storage. q Execute MPI/analytic tasks in the node containing the data q Problem: to know where data is to keep load balance: q Data-aware placement q Connect scheduler with Hercules to ask for data allocation Data-aware scheduling University Carlos III of Madrid
  • 35. 35 q Vertical coordination q Map application models on storage models q Coordinate multiple level buffering/caching for latency hiding q Vertical data flow control: compute nodes <-> I/O nodes <-> file system <-> storage q Multiple-level write-back / write-though, Multiple-level prefetching q Horizontal coordination q Collective I/O on compute nodes q I/O storage, aggregation, and operations on the I/O nodes University Carlos III of Madrid Problem: coordination LSS - GSS F. Isaila et all. Design and evaluation of multiple level data staging for Blue Gene systems. In IEEE Trans. of Parallel and Distributed Systems, 2011.
  • 36. 36 q Upcoming HPC and Big Data applications need hybrid infrastructures and execution platforms. q Storage platforms convergence among HPC and BD is a must q Integrate with memory-centric ad-hoc storage systems. q Create mechanisms to induce data locality in HPC-oriented paradigms. q Co-location of data and computation to improve performance q HPC scheduler must be data locality-aware q Data allocation must be CPU-aware q We need efficient collective communication mechanisms in Big Data platforms q Joining MPI ang Spark models through RDDs University Carlos III of Madrid Conclusions
  • 37. Towards Unification of HPC and Big Data Paradigms Universidad Complutense de Madrid Computer Science and Engineering Department University Carlos III of Madrid Jesús Carretero jcarrete@inf.uc3m.es