[Capella Day 2019] Model execution and system simulation in Capella (Obeo)
A common need in system architecture design is to verify that the architecture is correct and can satisfy its requirements. Executing a system architecture model means interacting with its state machines to test the system's control logic, verifying whether the logical sequences of functions and interfaces behave as intended in different scenarios.
However, the sequence alone is not enough to verify the outcome or output. We also need each function to do what it is supposed to do during model execution, so that its output can be verified; this is what we call "system simulation".
This presentation introduces how we perform model execution in Capella, and how to embed a digital mockup inside functions to perform "system simulation" with higher confidence.
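To make the idea of "executing" a state machine concrete, here is a minimal sketch (my own illustration in Python, not Capella tooling): events are fed through a transition table, and the recorded trace lets us check that the control logic visits the expected sequence of states.

```python
# A toy state machine executor: feed events through a transition table and
# record the visited states so the logical sequence can be verified.
class StateMachine:
    def __init__(self, initial, transitions):
        # transitions: {(state, event): next_state}; unknown events are ignored.
        self.state = initial
        self.transitions = transitions
        self.trace = [initial]

    def fire(self, event):
        self.state = self.transitions.get((self.state, event), self.state)
        self.trace.append(self.state)
        return self.state

# Usage: a hypothetical lamp controller; the trace is what we verify.
lamp = StateMachine("off", {("off", "switch"): "on", ("on", "switch"): "off"})
lamp.fire("switch")
lamp.fire("switch")
```

The trace-based check is the point: model execution produces an observable sequence that can be compared against the scenarios the architect expects.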
Renfei Xu, Glaway
Renfei Xu is the technical manager of MBSE solutions at Glaway. He has participated in many MBSE pilot projects in areas such as engine control, avionics, and mechatronics. In recent years, he has been responsible for deploying MBSE with Capella and the ARCADIA methodology at a radar research institute.
Wenhua Fang, Glaway
Wenhua Fang is the Director of Systems Engineering at Glaway, with more than 12 years of working experience in systems engineering.
He is responsible for more than 10 MBSE implementation projects in areas such as aircraft, engine control, avionics, and automotive. In recent years, he has led the team deploying MBSE in China, including with Capella and the ARCADIA methodology.
Data Parallel and Object Oriented Model (Nikhil Sharma)
All the content is taken from the Advanced Computer Architecture book (sections 10.1.3 and 10.1.4).
This presentation covers the basics of the data-parallel model and the object-oriented model.
The critical routines within signal processing algorithms are typically data-intensive and iterate many times, carrying out the same functions on multi-dimensional arrays and input streams. These routines use the vast majority of an implementation's resources, and for the longest time over the course of the algorithm's execution. Such routines are classed as Nested-Loop Programs. Whether the implementation is software running on a processor or a higher-performance customized hardware implementation, there is always a tradeoff between throughput performance (execution time) and the resources used, such as the amount of processor memory and registers in a software implementation or the gate count in a hardware implementation. This article shows the reader techniques and methods for manipulating a signal processing algorithm, in particular those conforming to a Nested-Loop Program, and the different generic implementation architectures along this tradeoff spectrum, as well as the different methods for describing and analyzing an algorithm implementation. Each of these methods for describing an algorithm is shown to correspond to a different abstraction-level view, and as such exposes different features and properties for ease of analysis and manipulation. Manipulation techniques, mainly algorithmic transformations, are described, and it is shown how these transformations take an implementation and form a new one at a different point in the trade-off spectrum of resources used versus throughput, by performing calculations in a different order and in a different way to arrive at the same result.
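A small illustration of the kind of algorithmic transformation described (my own sketch, not from the article): loop interchange on a nested-loop program, here a matrix-vector product. The two orderings perform the calculations in a different order, changing the memory access pattern and the buffering a hardware implementation would need, while arriving at the same result.

```python
# Two loop orderings of the same nested-loop program. Loop interchange
# changes the traversal (and thus resource/throughput characteristics)
# without changing the computed values.

def matvec_ij(A, x):
    """Row-major traversal: each output element is finished before the next."""
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for i in range(n):          # outer loop over rows
        for j in range(m):      # inner loop over columns
            y[i] += A[i][j] * x[j]
    return y

def matvec_ji(A, x):
    """Interchanged loops: partial sums accumulate across all outputs."""
    n, m = len(A), len(A[0])
    y = [0.0] * n
    for j in range(m):          # outer loop over columns
        for i in range(n):      # inner loop over rows
            y[i] += A[i][j] * x[j]
    return y
```

Both variants are valid implementation points on the resources-versus-throughput spectrum; which one is preferable depends on the storage layout and the target architecture.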
Application Profiling and Mapping on NoC-based MPSoC Emulation Platform on Re... (TELKOMNIKA JOURNAL)
In network-on-chip (NoC) based multi-processor system-on-chip (MPSoC) development, application profiling is one of the most crucial steps during design time for searching and exploring optimal mappings. Conventional mapping exploration methodologies analyse application-specific graphs by estimating their runtime behaviour using analytical or simulation models. However, the former does not replicate the actual application run-time performance, while the latter requires a significant amount of time for exploration. To map applications on a specific MPSoC platform, the application behaviour on a cycle-accurate emulated platform should be considered to obtain better mapping quality. This paper proposes an application mapping methodology that utilizes an MPSoC prototyped in a Field-Programmable Gate Array (FPGA). Applications are implemented on homogeneous MPSoC cores, and their costs are analysed and profiled on the platform in terms of execution time, intra-core communication delay, and inter-core communication delay. These metrics are used in the analytical evaluation of the application mapping. The proposed analytical mapping is compared against the exhaustive brute-force method. Results show that the proposed method produces mappings of comparable quality to the ground-truth solutions in a shorter evaluation time.
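To make the brute-force baseline concrete, here is a hypothetical sketch of exhaustive task-to-core mapping evaluation. The cost model (per-core load plus a penalty for communicating tasks placed on different cores) is an illustrative assumption, not the paper's exact formulation.

```python
# Exhaustively enumerate task-to-core mappings and pick the cheapest under a
# simple profiled-cost model: makespan (max per-core load) plus inter-core
# communication penalties. This is the "ground truth" a heuristic is compared to.
from itertools import product

def mapping_cost(mapping, exec_time, comm):
    # Load per core: sum of execution times of the tasks mapped to it.
    loads = {}
    for task, core in enumerate(mapping):
        loads[core] = loads.get(core, 0) + exec_time[task]
    # Pay a communication cost only when the two tasks sit on different cores.
    penalty = sum(c for (a, b), c in comm.items() if mapping[a] != mapping[b])
    return max(loads.values()) + penalty

def brute_force_map(n_tasks, n_cores, exec_time, comm):
    best = min(product(range(n_cores), repeat=n_tasks),
               key=lambda m: mapping_cost(m, exec_time, comm))
    return best, mapping_cost(best, exec_time, comm)
```

The search space is cores^tasks, which is exactly why the paper replaces exhaustive evaluation with an analytical method fed by profiled metrics.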
Taming Latency: Case Studies in MapReduce Data Analytics (EMC)
This session discusses how to achieve low latency in MapReduce data analysis, with various industrial and academic case studies. These illustrate improvements on MapReduce for squeezing latency out of the whole data processing stack, covering batch-mode MapReduce systems as well as stream processing systems. The session also introduces our BoltMR project efforts on this topic and discloses some interesting benchmark results.
After this session you will be able to:
Objective 1: Understand why low latency matters for many MapReduce-based big data analytics scenarios.
Objective 2: Learn the root causes of MapReduce latency, the obstacles to lowering it, and the various (im)mature solutions.
Objective 3: Understand the extent of MapReduce low latency needed for your own applications and which optimization techniques are potentially applicable.
Power-Efficient Programming Using Qualcomm Multicore Asynchronous Runtime Env... (Qualcomm Developer Network)
The need to support both today’s multicore performance and tomorrow’s heterogeneous computing has become increasingly important. Qualcomm® Multicore Asynchronous Runtime Environment (MARE) provides powerful and easy-to-use abstractions to write parallel software. This session will provide a deep dive into the concepts of power-efficient programming and how to use Qualcomm MARE APIs to get energy and thermal benefits for Android apps. Qualcomm Multicore Asynchronous Runtime Environment is a product of Qualcomm Technologies, Inc.
Learn more about Qualcomm Multicore Asynchronous Runtime Environment: https://developer.qualcomm.com/MARE
Watch this presentation on YouTube:
https://www.youtube.com/watch?v=RI8yXhBb8Hg
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO... (IJCI JOURNAL)
This paper presents the newly created Barracuda open-source framework, which aims to parallelize Java divide-and-conquer applications. The framework exploits implicit for-loop parallelism in the dividing and merging operations, making it a mixture of parallel for-loop and task parallelism. It targets shared-memory multiprocessors and hybrid distributed shared-memory architectures. We highlight the effectiveness of the framework and focus on the performance gain and programming effort when using it. Barracuda aims at large public actors as well as various application domains. In terms of performance, it comes very close to the Fork/Join framework while allowing end users to focus only on refactoring code and giving experts the opportunity to improve it.
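The divide-and-conquer pattern Barracuda parallelizes can be sketched as follows (in Python rather than Java, purely as an illustration of the pattern, not of Barracuda's API): the divide step splits the problem, subproblems run as independent tasks, and the merge step combines their results.

```python
# Parallel divide-and-conquer merge sort: halves are submitted as tasks up to
# a fixed depth, then the recursion continues sequentially. The merge step is
# where a framework like Barracuda would additionally exploit loop parallelism.
from concurrent.futures import ThreadPoolExecutor

def merge_sort(xs, pool=None, depth=2):
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    if pool is not None and depth > 0:
        # Divide: run the two halves as parallel tasks.
        left_f = pool.submit(merge_sort, xs[:mid], pool, depth - 1)
        right_f = pool.submit(merge_sort, xs[mid:], pool, depth - 1)
        left, right = left_f.result(), right_f.result()
    else:
        left, right = merge_sort(xs[:mid]), merge_sort(xs[mid:])
    # Merge: combine the two sorted halves.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]
```

Capping the parallel depth keeps the number of in-flight tasks bounded; beyond that depth, task overhead outweighs the gain, which mirrors the sequential-threshold tuning common to Fork/Join-style frameworks.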
Performance Comparison on Java Technologies: A Practical Approach (csandit)
Performance responsiveness and scalability are make-or-break qualities for software. Nearly everyone runs into performance problems at one time or another. This paper discusses performance issues faced during a project implemented in Java technologies, the challenges faced during the project's life cycle, and the mitigation actions performed. It compares three Java technologies and shows how improvements were made, through statistical analysis, in the application's response time. The paper concludes with a result analysis.
Concurrent Matrix Multiplication on Multi-core Processors (CSCJournals)
With the advent of multi-core processors, every processor has built-in parallel computational power, which can be fully utilized only if the program in execution is written accordingly. This study is part of ongoing research into the design of a new parallel programming model for multi-core architectures. In this paper we present a simple, highly efficient, and scalable implementation of a common matrix multiplication algorithm using SPC3 PM, a newly developed parallel programming model for general-purpose multi-core processors. Our study finds that matrix multiplication done concurrently on multi-cores using SPC3 PM requires much less execution time than with present standard parallel programming environments such as OpenMP. Our approach also shows scalability, better and more uniform speedup, and better utilization of the available cores than the same algorithm written using standard OpenMP or similar parallel programming tools. We tested our approach on up to 24 cores with matrix sizes varying from 100 x 100 to 10000 x 10000 elements, and across all these tests the proposed approach showed much improved performance and scalability.
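The core idea behind concurrent matrix multiplication can be shown with a generic row-partitioned sketch (this is not SPC3 PM, whose details are not given here; it only illustrates splitting independent output rows across workers).

```python
# Concurrent matrix multiplication by row partitioning: each worker computes
# a disjoint set of rows of C = A * B, so no synchronization is needed on the
# output. Real implementations would use blocking and cache-aware layouts.
from concurrent.futures import ThreadPoolExecutor

def matmul_rows(A, B, rows):
    """Compute the given rows of A*B, returning (row_index, row) pairs."""
    m, k = len(B[0]), len(B)
    return [(i, [sum(A[i][p] * B[p][j] for p in range(k)) for j in range(m)])
            for i in rows]

def concurrent_matmul(A, B, workers=4):
    n = len(A)
    C = [None] * n
    chunks = [range(w, n, workers) for w in range(workers)]  # round-robin rows
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(lambda r: matmul_rows(A, B, r), chunks):
            for i, row in part:
                C[i] = row
    return C
```

Because each output row depends only on read-shared inputs, the rows are embarrassingly parallel; the interesting engineering (which papers like this one address) is in scheduling, data placement, and keeping speedup uniform as core counts grow.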
PAGE: A Partition Aware Engine for Parallel Graph Computation (1crore projects)
CS 301 Computer Architecture, Student # 1, EID 09, Kingdom of .docx (faithxdunce63732)
CS 301 Computer Architecture
Student # 1 (E), ID: 09
Kingdom of Saudi Arabia, Royal Commission at Yanbu, Yanbu University College, Yanbu Al-Sinaiyah
Student # 2 (H), ID: 09
Kingdom of Saudi Arabia, Royal Commission at Yanbu, Yanbu University College, Yanbu Al-Sinaiyah
1. Introduction
High-performance processor design has recently taken two distinct approaches. One approach is to increase the execution rate by increasing the clock frequency of the processor or by reducing the execution latency of the operations. While this approach is important, much of its performance gain comes as a consequence of circuit and layout improvements and is beyond the scope of this research. The other approach is to directly exploit the instruction-level parallelism (ILP) in the program and to issue and execute multiple operations concurrently. This approach requires both compiler and microarchitecture support.
Traditional processor designs that issue and execute at most one operation per cycle are often called scalar designs. Static and dynamic scheduling techniques have been used to achieve better-than-scalar performance by issuing and executing more than one operation per cycle. While Johnson [7] defines a superscalar processor as a design that achieves better-than-scalar performance, popular usage of this term refers exclusively to those processors that use dynamic scheduling techniques. For clarity, we use instruction-level parallel processors to refer to the general class of processors that execute more than one operation per cycle.
The primary static scheduling technique uses the compiler to determine sets of operations that have their source operands ready and have no dependencies within the set. These operations can then be scheduled within the same instruction, subject only to hardware resource limits. Since each of the operations in an instruction is guaranteed by the compiler to be independent, the hardware is able to issue and execute these operations directly with no dynamic analysis. These multi-operation instructions are very long in comparison with traditional single-operation instructions, and processors using ...
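The static scheduling step described above can be sketched as a toy list scheduler (my own illustration, not from the text): operations whose dependencies have completed are packed together into one wide instruction, up to the machine's issue width.

```python
# Toy static scheduler: pack mutually independent, ready operations into wide
# instructions. The hardware can then issue each packet with no dynamic checks,
# because the compiler guaranteed independence within the packet.

def schedule(ops, deps, width=4):
    """ops: list of op names; deps: {op: set of ops it depends on}.
    Returns a list of 'instructions', each a list of independent ops."""
    done, instructions = set(), []
    remaining = list(ops)
    while remaining:
        # Ready = all dependencies already completed in earlier instructions.
        ready = [o for o in remaining if deps.get(o, set()) <= done]
        if not ready:
            raise ValueError("cyclic dependencies")
        packet = ready[:width]          # fill up to the issue width
        instructions.append(packet)
        done |= set(packet)
        remaining = [o for o in remaining if o not in packet]
    return instructions
```

Operations in the same packet never depend on one another, so they can issue in the same cycle; dependent operations are pushed to later packets, which is exactly the compile-time analysis that dynamic (superscalar) designs instead perform in hardware.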
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co-editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Sachpazis: Terzaghi Bearing Capacity Estimation in simple terms with Calculati... (Dr. Costas Sachpazis)
Terzaghi's soil bearing capacity theory, developed by Karl Terzaghi, is a fundamental principle in geotechnical engineering used to determine the bearing capacity of shallow foundations. This theory provides a method to calculate the ultimate bearing capacity of soil, which is the maximum load per unit area that the soil can support without undergoing shear failure. The Calculation HTML Code included.
Water scarcity is the lack of fresh water resources to meet the standard water demand. There are two types of water scarcity: physical water scarcity and economic water scarcity.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL) (MdTanvirMahtab2)
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL), a government-owned company of Bangladesh Chemical Industries Corporation under the Ministry of Industries.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible with IDM8000 CCR. Backplane-mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control over serial and TCP protocols.
Key Features
• Remote control: parallel or serial interface.
• Compatible with the MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with backplane-mounted serial communication.
• Compatible with commercial and defence aviation CCR systems.
• Remote control system for accessing CCR and allied systems over serial or TCP.
• Indigenized local support/presence in India.
• Easy configuration using DIP switches.
Student information management system project report ii.pdf (Kamal Acharya)
Our project is about student management. It covers the various actions related to student details, making it easy to add, edit, and delete student records, and it provides a less time-consuming process for viewing, adding, editing, and deleting students' marks.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptx (R&R Consult)
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
2. Introduction
Current Programming models
Framework
Architecture of AROM
Features and Execution
Result
Future Work
Conclusion
References
Wayne State University
3. This paper proposes a better tradeoff between implicit distributed programming, job efficiency, and openness in the design of the processing algorithm.
AROM, a new open-source distributed processing framework, is introduced, in which jobs are defined using directed acyclic graphs.
4. The MapReduce model - Pros and Cons
The DFG model - Pros and Cons
Pipelining Considerations
5. A MapReduce job is composed of phases:
Map phase - the Map function is executed on every key/value pair of the input data, producing intermediate key/value pairs.
Shuffle phase - the intermediate key/value pairs are sorted and grouped by key.
Reduce phase - the Reduce function is applied individually to each key and its associated values. This phase produces zero or one output.
Intermediate outputs between each phase are stored on the filesystem.
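The Map, Shuffle, and Reduce phases described on this slide can be sketched as a minimal in-memory pipeline (illustrative only; real frameworks persist intermediate output to the filesystem between phases, as the slide notes). Word count is the usual example.

```python
# A minimal Map-Shuffle-Reduce pipeline over in-memory data.
from collections import defaultdict

def map_phase(records, map_fn):
    # Map: apply map_fn to every input record, yielding key/value pairs.
    return [kv for rec in records for kv in map_fn(rec)]

def shuffle_phase(pairs):
    # Shuffle: sort and group the intermediate pairs by key.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return sorted(groups.items())

def reduce_phase(groups, reduce_fn):
    # Reduce: apply reduce_fn to each key and its list of values.
    return [reduce_fn(k, vs) for k, vs in groups]

def word_count(lines):
    mapped = map_phase(lines, lambda line: [(w, 1) for w in line.split()])
    return reduce_phase(shuffle_phase(mapped), lambda k, vs: (k, sum(vs)))
```

Note how the shuffle sits between every map and reduce whether or not the job needs sorting and grouping; this mandatory phase is exactly the weakness the later slides contrast with the DFG model.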
7. A Data Flow Graph (DFG) is a directed acyclic graph where each vertex represents a program and edges represent data channels.
At runtime the vertices of the DFG are scheduled to run on the available machines of the cluster.
Edges represent the flow of the processed data through the vertices.
Data communication between the vertices is abstracted from the user and can be physically implemented in several ways (e.g. network sockets, local filesystem, asynchronous I/O mechanisms...).
No data model is usually imposed; it is up to the user to handle the input and output formats for each vertex program.
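A tiny single-machine sketch of DFG execution (my own illustration, not AROM code): each vertex is a function, edges carry data from upstream vertices, and vertices run in topological order. In a real framework the vertices would be scheduled across cluster machines and the channels implemented as sockets or files.

```python
# Execute a DFG in topological order; source vertices carry the input data.
from graphlib import TopologicalSorter

def run_dfg(vertices, edges, sources):
    """vertices: {name: fn(list_of_inputs) -> output};
    edges: list of (src, dst); sources: {name: initial_value}."""
    preds = {v: [] for v in vertices}
    for s, d in edges:
        preds[d].append(s)
    graph = {d: set(ps) for d, ps in preds.items()}
    results = {}
    for v in TopologicalSorter(graph).static_order():
        if v in sources:
            results[v] = sources[v]           # source vertex: provides input
        else:
            inputs = [results[p] for p in preds[v]]
            results[v] = vertices[v](inputs)  # run vertex on upstream outputs
    return results
```

Because a vertex may have multiple incoming edges, multi-input operations such as joins fall out naturally, which is one of the pros the next slides list for the DFG model.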
8. MapReduce model:
Pros:
- Convenient to work with
- Simple for most users to implement
Cons:
- The sort phase is not necessarily required for each job
- The shuffle phase represents one of the most restrictive weaknesses of the MapReduce model
- Joins in particular have proven to be cumbersome to express
9. DFG model:
Pros:
- Provides the opportunity to implement jobs that are not constrained by a strict Map-Shuffle-Reduce schema.
- As vertex programs are able to receive multiple inputs, it is possible to implement relational operations such as joins in a natural and distributed fashion.
- Since jobs are not framed in a particular sequence of phases, intermediate output data do not have to be stored in the filesystem.
- The DFG model is less constrained than the MapReduce model, making it more convenient to compile a job from a higher-level language.
Cons:
- The DFG model is more general than the MapReduce model, which can be seen as a restriction of the DFG model.
11. Even if we could implement pipelined jobs in MapReduce, this is not as natural nor as efficient as a pipeline defined using a DFG, where each stage can be represented by a group of vertices.
The stages of such pipelines would consist of Map-only jobs.
The mandatory shuffle phase may not even be useful for the pipeline stages.
The back-and-forth of the intermediate data on the distributed filesystem can also cause a significant performance penalty.
Additional code is also required for coordinating and chaining the separate MapReduce stages, and in the end the iterations are diluted, making the code less obvious to understand.
Each stage of a MapReduce pipeline needs to wait for the completion of the previous stage before it can begin processing.
12. The requirements of the framework are the following:
1) Provide a coherent API to express jobs as DFGs, using a programming paradigm which favors reusable and generic operators.
2) Base the architecture on asynchronous actors and event-driven constructs, making it suitable for large-scale deployment on cloud computing environments.
The vision behind AROM is to provide a general-purpose environment for testing and prototyping distributed parallel processing jobs and models.
13. Job Definition
Operators and Data Model
Processing Stages
Enforcing Genericity and Reusability of Operators
Operator Types
Job Scheduling and Execution
Specificities
14. Job Definition:
- Jobs are described by Data Flow Graphs.
- A vertex (also called an operator) receives data on its input, processes it and emits the
resulting data to the next operator.
- Edges between the vertices depict the flow of the data between the operators.
Operators and Data Model:
- An operator can receive and emit data on multiple inputs.
- Data takes the form (i, d), indicating that the data unit d is handed through the incoming
edge i connected to the upstream operator.
Processing Stages:
- Two principal phases: Process and Finish.
- The Process phase comes first and holds the operator's logic, which is executed on
the data units as they arrive.
- The Finish phase triggers once all the entries have been received and processed.
- Users can define their own logic and functions.
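The two-phase operator model above can be sketched as follows (a Python illustration under assumed names; AROM's real Scala interface may differ): Process runs on each (i, d) pair as it arrives, and Finish fires once all input is consumed.

```python
# Hypothetical sketch (not AROM's real API): an operator receives (i, d)
# pairs, where i identifies the incoming edge; it runs its Process logic
# on each data unit as it arrives, and runs Finish once all are received.

class SumOperator:
    def __init__(self):
        self.totals = {}          # one running total per incoming edge

    def process(self, i, d):
        """Process phase: executed on each data unit as it arrives."""
        self.totals[i] = self.totals.get(i, 0) + d

    def finish(self, emit):
        """Finish phase: triggered after every entry has been processed."""
        for i, total in sorted(self.totals.items()):
            emit((i, total))      # hand results to the downstream operator

op = SumOperator()
for i, d in [(0, 1), (1, 10), (0, 2), (1, 20)]:
    op.process(i, d)

out = []
op.finish(out.append)
print(out)  # [(0, 3), (1, 30)]
```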
15. Enforcing Genericity and Reusability of Operators:
- AROM makes it possible to develop generic and reusable operators.
- The generic filtering operator applies a user-defined function to the input to
determine whether the data should be transmitted to the downstream operator.
- Only a predicate function is required, returning TRUE or FALSE depending on the input
data.
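A minimal sketch of such a generic filter (Python illustration; `make_filter` is an assumed name, not AROM's): all the user supplies is the predicate, and the forwarding logic stays generic and reusable.

```python
# Hypothetical sketch of the generic filtering operator: only a
# user-supplied predicate (returning True or False per data unit) is
# needed; transmission downstream is handled generically.

def make_filter(predicate):
    """Build a reusable filter vertex from a predicate function."""
    def operator(data_units, emit):
        for d in data_units:
            if predicate(d):      # forward only entries the predicate accepts
                emit(d)
    return operator

evens_only = make_filter(lambda d: d % 2 == 0)
out = []
evens_only([1, 2, 3, 4, 5, 6], out.append)
print(out)  # [2, 4, 6]
```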
Operator Types:
- Different operator interfaces are available, each differing in its process cycle and in the
way it handles the incoming data.
- Asynchronous: processes the entries as they come from the upstream operators, on a first-
come, first-served basis.
- Synchronous: processes only once there are entries present at each of its upstream
inputs.
- Vector: a vector operator operates on batches of entries provided by the upstream
operator.
- Scalar: a scalar operator operates on entries one at a time.
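The asynchronous/synchronous distinction can be illustrated with a small Python sketch (assumed class names, not AROM's interface): the asynchronous operator fires on every arriving entry, while the synchronous one buffers until every upstream input has an entry ready.

```python
# Hypothetical sketch (not AROM's API) of the two firing behaviours:
# an asynchronous operator processes each entry as it arrives, first
# come first served; a synchronous one fires only when every upstream
# input holds an entry.
from collections import deque

class AsyncOp:
    def __init__(self, fn):
        self.fn, self.out = fn, []
    def feed(self, i, d):
        self.out.append(self.fn(i, d))   # fires immediately per entry

class SyncOp:
    def __init__(self, fn, n_inputs):
        self.fn, self.out = fn, []
        self.queues = [deque() for _ in range(n_inputs)]
    def feed(self, i, d):
        self.queues[i].append(d)
        if all(self.queues):             # fires only when all inputs are ready
            self.out.append(self.fn(*[q.popleft() for q in self.queues]))

a = AsyncOp(lambda i, d: (i, d))
for i, d in [(0, "x"), (1, "y")]:
    a.feed(i, d)
print(a.out)  # [(0, 'x'), (1, 'y')]

s = SyncOp(lambda left, right: left + right, 2)
s.feed(0, 1); print(s.out)  # [] -- still waiting on input 1
s.feed(1, 2); print(s.out)  # [3]
```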
16. Job Scheduling and Execution:
- Master-slave architecture.
- The scheduling of the operators is based on:
- the synchronous/asynchronous nature of the operator
- the source nature of the operators
- Master: - Select a slave node among those available.
- Send the code of the operator to the selected slave.
- Update all the slaves running a predecessor operator with the location of the new operator.
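The master's three steps can be sketched like this (Python illustration; `schedule` and its bookkeeping are assumed for the example, not AROM's actual scheduler):

```python
# Hypothetical sketch of the master's scheduling steps: pick an available
# slave, place the operator there, then notify every slave running a
# predecessor of the new operator's location.

def schedule(operator, predecessors, available, locations):
    """locations maps operator name -> slave node name."""
    slave = available.pop(0)                    # 1) select a slave node
    locations[operator] = slave                 # 2) send the operator's code
    updates = [(locations[p], operator, slave)  # 3) notify predecessors
               for p in predecessors if p in locations]
    return slave, updates

locations = {}
slave, _ = schedule("map", [], ["s1", "s2"], locations)
print(slave)  # s1
slave, updates = schedule("reduce", ["map"], ["s2"], locations)
print(slave, updates)  # s2 [('s1', 'reduce', 's2')]
```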
Specificities:
- AROM is implemented in Scala, a functional programming language which compiles to
JVM bytecode.
- This enables the usage of anonymous and higher-order functions.
- Scala binaries can include Java code and reuse existing libraries.
- Scala also enables the use of the powerful Scala collections API, which provides natural
handling of tuples.
21. Better performance for the AROM framework.
The MapReduce model is in fact a restriction of the more general DFG
model.
Using a DFG to define the job leaves more room for optimization.
Compared to the MapReduce version, it was possible to implement at least
two optimizations.
A third possible optimization would be to directly propagate the count of
the inbound articles from the first iteration to all the following iterations, in
order to transmit the results of the computation earlier for each stage.
22. Migrate the implementation to a more scalable stack.
Plans to further develop the scheduling:
- Add speculative execution scheduling in order to minimize the
impact of an operator failure on performance.
- The scheduler should be able to react to additional resources becoming
available in the pool of workers and extend or reduce the capacity of the
scheduling in scenarios of scaling out and scaling in.
- The DFG execution plan should be modifiable at runtime.
23. AROM is an implementation of a distributed parallel
processing framework using directed acyclic graphs.
The primary goal of the framework is to provide a
playground for testing and developing distributed
parallel processing.
The model is based on the general DFG processing model.
Paradigms from functional programming are used.