The implementation of multi-context systems on coarse-grained reconfigurable platforms can bring several benefits in terms of efficient resource usage and power management. Nevertheless, on-the-fly reconfiguration and mapping are not straightforward, and finding the optimal configuration of the substrate can be extremely time consuming. In this paper we present an early-stage design space exploration methodology intended for dataflow-based design flows in which multiple input specifications have to be taken into account. The proposed approach, coupled with the Multi-Dataflow Composer tool, has been exploited to assemble the central reconfigurable computing core of an accelerator for video/image processing.
DSE and Profiling of Multi-Context Coarse-Grained Reconfigurable Systems
1. 8th International Symposium on Image and Signal Processing and Analysis - 2013
September 4th-6th, 2013, Trieste, Italy
Special Session on Hardware-Software Co-design Methodologies for Streaming Processing in Digital Media Technologies
Design Space Exploration and Profiling of Multi-Context Coarse-Grained Reconfigurable Systems
Francesca Palumbo, Carlo Sau and Luigi Raffo
EOLAB - Microelectronics Lab
DIEE - Dept. of Electrical and Electronic Eng.
University of Cagliari - ITALY
3. Background
• Systems and applications on the market are becoming ever more complex and power hungry.
• Flexibility, portability, wearability, implantability, battery-life limits and real-time operation, along with computational correctness, need to be taken into account.
ICT TRENDS:
• Ubiquitous access
• Personalized services
• Delocalized computing and storage
• Massive data processing systems
• High-quality virtual reality
• Intelligent sensing
• High-performance real-time embedded computing
EXAMPLES:
• Domestic robot
• Telepresence
• The car of the future
• Aerospace and avionics
• Human++
• Computational science
• Realistic games
• Smart camera networks
SOURCE: http://www.hipeac.net/roadmap
4. Background
INTEGRATION, SPECIALIZATION and HIGH-PERFORMANCE REQUIREMENTS in such COMPLEX, COMPUTATIONALLY HUNGRY ENVIRONMENTS have threatened the TRADITIONAL DESIGN FLOW.
SOURCE: http://www.hipeac.net/roadmap
8. Research Evolution
Reconfigurable Video Coding modularity, coupled with hardware reconfiguration, has been exploited to efficiently map multiple applications onto a unique substrate (Multi-Dataflow Composer tool).
• High-level dataflow combination tool, front-end of the actual MDC tool. [DASIP 2010]
• Multi-Dataflow Composer (MDC) tool: concrete definition of the hardware template and of the D-MoC based mapping strategy. [DASIP 2011, JRTIP]
• Integration of the full high-level-to-hardware composition and generation framework. [ISCAS 2012]
15. MDC Approach: the tool
[Figure: MDC tool flow. The FRONTEND combines the Single Application Dataflows (SADs, e.g. A→B→C and A→E→C→D) into one Directed Flow Graph (DFG); the BACKEND, drawing on an HDL component library, generates the coarse-grained reconfigurable output, in which shared actors are reached through SBOX switching elements driven by an appID signal.]
• The Multi-Dataflow Composer (MDC) tool IS an automatic platform generator combining different dataflow networks (SADs) on a coarse-grained reconfigurable template.
• The MDC IS responsible for providing runtime programmability of the hardware substrate to switch among the given SADs.
• The MDC IS NOT capable of High-Level Synthesis from SAD to hardware.
The benefits of the MDC approach have already been demonstrated [F. Palumbo et al., "The multi-dataflow composer tool: generation of on-the-fly reconfigurable platforms", in Jrnl of Real Time Image Processing], BUT an early stage trade-off analysis may be extremely useful.
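To make the SBox mechanism concrete, the following is a minimal sketch of dataflow merging under a toy edge-set encoding of the SADs; the function merge_sads, the SBOX_<actor> naming and the data structures are illustrative assumptions, not the actual MDC internals:

```python
# Toy encoding (illustrative, not the MDC data structures): each SAD is a
# set of directed edges (producer, consumer) over named actors.
SAD_1 = {("A", "B"), ("B", "C")}              # application 1: A -> B -> C
SAD_2 = {("A", "E"), ("E", "C"), ("C", "D")}  # application 2: A -> E -> C -> D

def merge_sads(sads):
    """Fold several SADs onto one substrate: each actor is instantiated once;
    wherever the applications disagree on which producer feeds an actor, an
    SBox (a multiplexer steered by the runtime appID) is inserted."""
    actors = {a for sad in sads for edge in sad for a in edge}
    merged_edges, sboxes = set(), []
    for dst in sorted(actors):
        sources = {src for sad in sads for src, d in sad if d == dst}
        if len(sources) <= 1:
            # at most one producer across all applications: a direct wire
            merged_edges |= {(src, dst) for src in sources}
        else:
            # conflicting producers: route them through an SBox
            sbox = f"SBOX_{dst}"
            sboxes.append((sbox, sorted(sources), dst))
            merged_edges |= {(src, sbox) for src in sources}
            merged_edges.add((sbox, dst))
    return merged_edges, sboxes

edges, sboxes = merge_sads([SAD_1, SAD_2])
print(sboxes)  # [('SBOX_C', ['B', 'E'], 'C')]: appID selects B or E as C's input
```

With the two example SADs above, actor C is fed by B in one application and by E in the other, so a single SBox in front of C suffices; all the other connections stay direct wires, which is the resource-sharing effect the deck describes.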
26. The MDC DSE and profiler
[Figure: the DSE flow. A COMBINATIONS GENERATOR derives the candidate DFGs from the input SADs; each DFG is processed by the MDC TOOL; a LOW-LEVEL FEEDBACK ANALYSIS annotates every candidate with TIMING, AREA and POWER figures; a final PARETO ANALYSIS extracts the OPTIMAL DFG.]
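As a sketch of the concluding Pareto step, assuming each candidate DFG has already been annotated by the low-level feedback analysis with area, power and critical-path figures (the candidate names and numbers below are made up):

```python
# Illustrative Pareto filter over annotated candidates; tuple layout assumed
# to be (name, area, power, critical_path), all three metrics minimized.
candidates = [
    ("all_merged",   100.0, 10.0, 8.5),   # hypothetical annotated figures
    ("all_parallel", 180.0, 17.0, 7.0),
    ("hybrid_1",     130.0, 12.0, 7.0),
]

def dominates(a, b):
    """a dominates b if a is no worse on every metric and strictly better on one."""
    return all(x <= y for x, y in zip(a[1:], b[1:])) and a[1:] != b[1:]

pareto = [c for c in candidates
          if not any(dominates(other, c) for other in candidates)]
print([c[0] for c in pareto])  # ['all_merged', 'hybrid_1']
```

Here the fully parallel design is dominated by the hybrid one, mirroring the kind of outcome discussed in the use cases below, where an all-merged point wins on area/power and a hybrid point wins on frequency.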
27. Combinations Generator (1)
Number and type of combinations:
• D_notMer: the not-merged composition, i.e. the N input SADs placed in parallel;
• D_Mer: as many resources as possible are shared, merging together all the N input SADs;
• D_partMer: resource sharing is not maximized; each DFG is composed of i not-merged SADs in parallel with one of the permutations of the other N-i.
D = D_notMer + D_Mer + D_partMer = 1 + N! + \sum_{k=2}^{N-1} \prod_{j=k}^{N} j
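The count can be sanity-checked in a few lines of Python (illustrative code, using \prod_{j=k}^{N} j = N!/(k-1)!); it reproduces the design space sizes reported for the use cases later in the deck:

```python
from math import factorial

def design_space_size(n: int) -> int:
    """D = 1 (all SADs in parallel) + n! (all merged, order-dependent)
         + the partially merged compositions."""
    part_mer = sum(factorial(n) // factorial(k - 1) for k in range(2, n))
    return 1 + factorial(n) + part_mer

# Matches the design space sizes quoted in the use cases below:
assert design_space_size(6) == 1951         # UC1, 6 SADs
assert design_space_size(7) == 13693        # UC2, 7 SADs
assert design_space_size(11) == 108505101   # UC3, 11 SADs
```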
32. Low-level Feedback Analysis: area and power
DFG = <V, E>, where V is the set of vertices v_i ∈ V and E is the set of edges e_i ∈ E.
a_i = Area(v_i), so Area(DFG) = Σ_i a_i
p_i = Power(v_i), so Power(DFG) = Σ_i p_i
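In code form, the additive area/power model is a plain sum over the annotated vertices (illustrative sketch; the per-vertex figures, names and units are hypothetical, assumed back-annotated from synthesis):

```python
# Additive area/power model over the DFG vertices (hypothetical values).
vertex_area  = {"A": 120.0, "B": 340.0, "SBOX_C": 15.0}  # e.g. equivalent gates
vertex_power = {"A": 0.8,   "B": 2.1,   "SBOX_C": 0.1}   # e.g. mW

dfg_area  = sum(vertex_area.values())   # Area(DFG)  = sum of a_i
dfg_power = sum(vertex_power.values())  # Power(DFG) = sum of p_i
```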
37. Low-level Feedback Analysis: frequency
• cp_Nj = CP (critical path) of the j-th SAD, so CP_static = max_j(cp_Nj)
• cp_Si = CP of the i-th SBox, so CP_seqSB = Σ_i cp_Si
CP(DFG) = max(CP_static, CP_seqSB)
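The corresponding critical-path estimate, again as a sketch with made-up values:

```python
# Critical-path model: the merged design is bounded either by the slowest
# SAD or by the SBoxes traversed in sequence (hypothetical delays, ns).
cp_sad  = [6.2, 7.8, 7.1]   # cp_Nj, one per SAD
cp_sbox = [0.4, 0.5]        # cp_Si, one per SBox

cp_static = max(cp_sad)                # CP_static = max_j cp_Nj
cp_seq_sb = sum(cp_sbox)               # CP_seqSB  = sum_i cp_Si
cp_dfg    = max(cp_static, cp_seq_sb)  # CP(DFG)
```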
38. Design Under Test
Table: composition of the different analyzed use cases (UC1: Antialiasing; UC2: Zoom; UC3: Antialiasing & Zoom).
SADs                 | UC1 | UC2 | UC3
Qsort                |  X  |  -  |  X
Min_Max              |  X  |  X  |  X
Corr                 |  X  |  -  |  X
Abs                  |  X  |  X  |  X
Rgb2Ycc              |  X  |  -  |  X
Ycc2Rgb              |  X  |  -  |  X
Sbwlabel             |  -  |  X  |  X
Median               |  -  |  X  |  X
Cubic                |  -  |  X  |  X
Cubic_Conv           |  -  |  X  |  X
Check_GeneralBilevel |  -  |  X  |  X
[Figure: target system, with a host processor connected through a HAL to the image and video coprocessor.]
[F. Palumbo et al., "The multi-dataflow composer tool: generation of on-the-fly reconfigurable platforms", in Jrnl of Real Time Image Processing]
43. UC1 - Pareto Analysis
UC1: Antialiasing. Involved SADs: 6; design space size: 1951 points.
No impact of MDC on the CP of the system.
• BEST AREA/POWER SOLUTION: all merged (a1p1)
• BEST FREQUENCY SOLUTION: all merged (a1p1)
49. UC2 - Pareto Analysis
UC2: Zoom. Involved SADs: 7; design space size: 13693 points.
[Pareto chart: critical path reductions of 23% and 14%, at the cost of +2.4% area and +1.8% power.]
MDC impacts the CP of the system.
• BEST AREA/POWER SOLUTION: all merged (z3p1, z3p2)
• BEST FREQUENCY SOLUTION: hybrid, 6 merged SADs (z3p3)
51. UC3 - Pareto Analysis
UC3: Antialiasing & Zoom. Involved SADs: 11; design space size: 108505101 points.
Table: percentage of the SAD overlapping actors with respect to an all-merged solution.
SADs                 |  UC1   |  UC2   |  UC3
Qsort                |  2.94% |   -    |  1.89%
Min_Max              |  2.94% |  3.12% |  1.89%
Corr                 | 11.76% |   -    |  9.43%
Abs                  |  2.94% |  3.12% |  1.89%
Rgb2Ycc              | 20.59% |   -    | 15.09%
Ycc2Rgb              | 26.47% |   -    | 18.87%
Sbwlabel             |   -    | 15.62% | 11.32%
Median               |   -    | 18.75% | 13.21%
Cubic                |   -    | 18.75% | 11.32%
Cubic_Conv           |   -    |  9.38% |  5.66%
Check_GeneralBilevel |   -    | 18.75% | 11.32%
55. UC3 - Pareto Analysis
UC3: Antialiasing & Zoom. Involved SADs: 6 of 11; design subspace size: 1951 points.
A heuristic is needed when many SADs are involved:
• BEST AREA/POWER SOLUTION: heuristic applied to the all-merged compositions only
• BEST FREQUENCY SOLUTION: affine to the proposed heuristic
56. Final Remarks
• A DSE and profiling methodology specifically conceived to guide designers targeting multi-context architectures has been presented:
– The DSE and profiling are performed at a high level of abstraction, but back-annotated low-level information is taken into account.
• Synthesis trials confirmed the applicability of the proposed methodology, with an estimation error lower than 7% on average.
– Accuracy can be improved by adopting more accurate parameterized models to compute the metrics of interest.
• Future developments will concern:
– Accuracy improvement;
– Introduction of automated, clever heuristics, based on high-level information, to limit the dimension of the design space.
57. Acknowledgements
The research leading to these results has received funding from:
• the Region of Sardinia, L.R. 7/2007, under grant agreement CRP-18324 [RPCT Project];
• the Region of Sardinia, Young Researchers Grant, POR Sardegna FSE 2007-2013, L.R. 7/2007 "Promotion of the scientific research and technological innovation in Sardinia".
58. 8th International Symposium on Image and Signal Processing and Analysis - 2013
September 4th-6th, 2013, Trieste, Italy
Special Session on Hardware-Software Co-design Methodologies for Streaming Processing in Digital Media Technologies
DSE and Profiling of Multi-Context Coarse-Grained Reconfigurable Systems
Carlo Sau, Ph.D. student
carlo.sau@diee.unica.it
EOLAB - Microelectronics Lab
Dept. of Electrical and Electronics Eng.
University of Cagliari (ITALY)