An Open Discussion of RISC-V BitManip, trends, and comparisons _ Claire | RISC-V International
Join RISC-V BitManip industry leader Claire Xenia Wolf and Dr. James Cuff for an open and lively discussion with an interactive Q&A on RISC-V and BitManip including trends and comparisons with the existing architecture landscape including x86 and ARM and what specifically makes RISC-V unique.
eBPF Debugging Infrastructure - Current Techniques | Netronome
eBPF (extended Berkeley Packet Filter), in particular with its driver-level hook XDP (eXpress Data Path), has increased in importance over the past few years. As a result, the ability to rapidly debug and diagnose problems is becoming more relevant. This talk will cover common issues faced and techniques to diagnose them, including the use of bpftool for map and program introspection, the use of disassembly to inspect generated assembly code and other methods such as using debug prints and how to apply these techniques when eBPF programs are offloaded to the hardware.
The talk will also explore where the current gaps in debugging infrastructure are and suggest some of the next steps to improve this, for example, integrations with tools such as strace, valgrind or even the LLDB debugger.
Keynote (Mike Muller) - Is There Anything New in Heterogeneous Computing - by... | AMD Developer Central
Keynote presentation, Is There Anything New in Heterogeneous Computing, by Mike Muller, Chief Technology Officer, ARM, at the AMD Developer Summit (APU13), Nov. 11-13, 2013.
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU | Linaro
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
Speaker: Alex Bennée
Date: September 22, 2015
★ Session Description ★
While QEMU has continued to be optimised for KVM to make use of the growing number of cores on modern systems, TCG emulation has been stuck running in a single thread. This year there is another push to get a workable solution merged upstream. We shall present a review of the challenges that need to be addressed: locking, TLB and cache maintenance, and a generic solution for the various atomic/exclusive operations. We will discuss previous work in this field before presenting a design that addresses these requirements. Finally, we shall look at the currently proposed patches and the design decisions they have taken.
★ Resources ★
Video: https://www.youtube.com/watch?v=9xQGDTEmNtI
Presentation: http://www.slideshare.net/linaroorg/sfo15202-towards-multithreaded-tiny-code-generator-tcg-in-qemu
Etherpad: pad.linaro.org/p/sfo15-202
Pathable: https://sfo15.pathable.com/meetings/302833
★ Event Details ★
Linaro Connect San Francisco 2015 - #SFO15
September 21-25, 2015
Hyatt Regency Hotel
http://www.linaro.org
http://connect.linaro.org
PyCoRAM: Yet Another Implementation of CoRAM Memory Architecture for Modern F... | Shinya Takamaeda-Y
Presentation slide for CARL2013 (Co-located with MICRO-46) at Davis, CA.
PyCoRAM: Yet Another Implementation of CoRAM Memory Architecture for Modern FPGA-based Computing
A Framework for Efficient Rapid Prototyping by Virtually Enlarging FPGA Resou... | Shinya Takamaeda-Y
A Framework for Efficient Rapid Prototyping by Virtually Enlarging FPGA Resources (ReConFig2014@Cancun, Mexico)
flipSyrup, a new framework for rapid prototyping, is proposed.
A race of two compilers: GraalVM JIT versus HotSpot JIT C2. Which one offers ... | J On The Beach
Do you want to check the efficiency of the new, state of the art, GraalVM JIT Compiler in comparison to the old but mostly used JIT C2? Let’s have a side by side comparison from a performance standpoint on the same source code.
The talk reveals how a traditional Just-In-Time compiler (e.g. JIT C2) from HotSpot/OpenJDK internally manages runtime optimizations for hot methods in comparison to the new, state-of-the-art GraalVM JIT compiler on the same source code, emphasizing the internals and strategies each compiler uses to achieve better performance in the most common situations (or code patterns). For each optimization, there is Java source code and the corresponding generated assembly code to show what really happens under the hood.
Each test is covered by a dedicated benchmark (JMH), timings and conclusions. Main topics of the agenda: scalar replacement, null checks, virtual calls, lock coarsening, lock elision, lambdas, and vectorization (few cases).
The tools used during my research study are JITWatch, the Java Microbenchmark Harness (JMH), and perf. All test scenarios will be launched against the latest official Java release (e.g. version 11).
Getting Started with Raspberry Pi - USC 2013 | Tom Paulus
The Raspberry Pi is a small, credit-card-sized Linux computer. Developers and hobbyists around the world are creating miraculous applications and projects, and now you can join them. This presentation covers the first steps to using your Pi, from basics like burning your SD card to creating a Weather Reporter. It also discusses GPIO basics and simple Python tools, and covers communication with other components using SPI or I2C.
eBPF has 64-bit general-purpose registers, so 32-bit architectures normally need to use a register pair to model each of them and must generate extra instructions to manipulate the high 32 bits in the pair. Some of this overhead could be eliminated if the JIT compiler knew that only the low 32 bits of a register are of interest. This can be determined through data flow (DF) analysis techniques: either the classic iterative DF analysis or a "path-sensitive" version based on the verifier's code path walker.
In this talk, implementations of both versions of the DF analyser will be presented. We will first see what a classic eBPF DF analyser based on def-use chains looks like, and consider the possibility of integrating it with the previously proposed eBPF control flow graph framework to make a stand-alone eBPF global DF analyser that could potentially serve as a library. Then another, "path-sensitive" DF analyser based on the existing verifier code path walker will be presented. We will discuss how function calls, path pruning, and path switching affect the implementation. Finally, we will summarize the pros and cons of each and see how each could be adapted to 64-bit and 32-bit architecture back-ends.
Also, eBPF has 32-bit sub-registers and associated ALU32 instructions. Enabling them (-mattr=+alu32) in LLVM code-gen lets the generated eBPF sequences carry more 32-bit information, which could potentially ease the flow analyser's work. This will be briefly discussed in the talk as well.
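The classic iterative variant mentioned above can be sketched as a backward liveness analysis over the high 32 bits of each register. This is only an illustration under stated assumptions, not the analyser presented in the talk: the Insn record, its fields, and the sample program are hypothetical, and every 64-bit read is conservatively treated as needing the high half.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Insn:
    """Hypothetical, simplified view of one eBPF instruction."""
    text: str
    dst: Optional[str] = None        # register written, if any
    uses64: Tuple[str, ...] = ()     # registers read as full 64-bit values
    succs: Tuple[int, ...] = ()      # indices of successor instructions


def high_half_live_out(prog: List[Insn]) -> List[Set[str]]:
    """For each instruction, registers whose high 32 bits are read later."""
    live_in: List[Set[str]] = [set() for _ in prog]
    live_out: List[Set[str]] = [set() for _ in prog]
    changed = True
    while changed:                   # iterate until a fixed point is reached
        changed = False
        for i in reversed(range(len(prog))):
            insn = prog[i]
            out: Set[str] = set()
            for s in insn.succs:
                out |= live_in[s]
            # A write kills the old value; 64-bit reads demand the high half.
            new_in = (out - {insn.dst}) | set(insn.uses64)
            if out != live_out[i] or new_in != live_in[i]:
                live_out[i], live_in[i] = out, new_in
                changed = True
    return live_out


def low32_only_defs(prog: List[Insn]) -> List[int]:
    """Indices of writes whose high 32 bits are never read afterwards."""
    live_out = high_half_live_out(prog)
    return [i for i, insn in enumerate(prog)
            if insn.dst is not None and insn.dst not in live_out[i]]


# Hypothetical program; 32-bit (w-register) reads are simply not listed in uses64.
prog = [
    Insn("r1 = *(u64 *)(r6 + 0)", dst="r1", uses64=("r6",), succs=(1,)),
    Insn("r2 = r1", dst="r2", uses64=("r1",), succs=(2,)),
    Insn("w0 = w2", dst="r0", succs=(3,)),
    Insn("if w0 == 0 goto 5", succs=(4, 5)),
    Insn("r0 = r1", dst="r0", uses64=("r1",), succs=(5,)),
    Insn("exit", uses64=("r0",)),
]
print(low32_only_defs(prog))
```

In this sample only the write to r2 is reported, because r2 is later read through its 32-bit sub-register alone. A 32-bit JIT could skip the high-half instructions for that write. A more precise analysis would also propagate the "low half only" demand backwards through operations such as add or mov, which this sketch deliberately does not attempt.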
Arm tools and roadmap for SVE compiler support | Linaro
By Richard Sandiford, Florian Hahn (Arm), ARM
This presentation will give an overview of what Arm is doing to develop the HPC ecosystem, with a particular focus on SVE. It will include a brief synopsis of both the commercial and open-source tools and libraries that Arm is developing and a description of the various community initiatives that Arm is involved in. The bulk of the talk will describe the roadmap for SVE compiler support in both GCC and LLVM. It will cover the work that has already been done to support both hand-optimised and automatically-vectorised code, and the plans for future improvements.
For more info on The Linaro High Performance Computing (HPC) visit https://www.linaro.org/sig/hpc/
Pragmatic Optimization in Modern Programming - Ordering Optimization Approaches | Marina Kolpakova
The slides give an idea about how to look pragmatically at software optimization and order optimization approaches according to this pragmatic point of view
Show Me the Outcomes!
Evaluating and Proving Your Impact on the Community
Learn how to:
1. Understand how to build a successful outcomes plan for your nonprofit organization
2. Increase your funding by proving your program success to your funders
3. Make informed decisions about future programming and resource allocation
You will also receive an inside view of the Apricot Outcomes Palette™, a dynamic outcomes reporting tool
Presented by:
Kathryn Engelhardt-Cronk
Founder/CEO/President
Community TechKnowledge, Inc.
Low-cost microcontrollers are being used more and more often in embedded applications that previously may have used a microprocessor. Microcontrollers often run a real-time operating system (RTOS) rather than a full operating system like Linux. In this webinar we introduce FreeRTOS, a popular RTOS for microcontrollers that has been ported to 35 microcontroller platforms.
An introduction to binary translation in QEMU (TCG) that describes how it works. In addition, there is a section that demonstrates qemu-monitor, a debug tool for AArch64/QEMU.
There are many animations in the slides, so download them and open them with Microsoft PowerPoint for the best experience. The download link is below.
Google Drive link: http://goo.gl/XXMC9X
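The general idea of dynamic binary translation with a translation cache can be illustrated with a toy sketch. It is not QEMU's implementation: TCG lowers guest code through an intermediate representation into host machine code, whereas this example translates a made-up three-instruction guest ISA into Python closures. The instruction names, register names, and helper functions are all assumptions made for illustration.

```python
def translate_block(program, pc):
    """Translate one guest basic block starting at pc into a host callable.

    The callable executes the whole block and returns the next guest pc,
    or None when the guest halts.
    """
    body = []                          # straight-line part of the block
    while True:
        op, *args = program[pc]
        if op == "addi":               # regs[r] += imm
            body.append(tuple(args))
            pc += 1
        elif op == "jnz":              # a branch ends the basic block
            r, target = args
            fallthrough = pc + 1

            def block(regs):
                for reg, imm in body:
                    regs[reg] += imm
                return target if regs[r] != 0 else fallthrough
            return block
        elif op == "halt":             # halting also ends the block
            def block(regs):
                for reg, imm in body:
                    regs[reg] += imm
                return None
            return block
        else:
            raise ValueError(f"unknown guest op: {op}")


def run(program, regs):
    """Execute the guest, translating each block once and reusing it after."""
    cache = {}                         # guest pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = cache[pc] = translate_block(program, pc)
            translations += 1
        pc = block(regs)
    return translations


# Hypothetical guest program: add 2 to acc, n times.
program = [
    ("addi", "acc", 2),
    ("addi", "n", -1),
    ("jnz", "n", 0),
    ("halt",),
]
regs = {"acc": 0, "n": 3}
count = run(program, regs)
print(regs, count)
```

The loop body runs three times but is translated only once, which is the main payoff of caching translated blocks by guest program counter.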
FPGA based 10G Performance Tester for HW OpenFlow Switch | Yutaka Yasuda
SDN operators need to measure the performance of OpenFlow (OF) hardware switches on their own sites, because latency can differ by a factor of 1000 depending on the specified flow entry. An ASIC can forward in several μs, but the software (CPU) path may take milliseconds.
To protect yourself from an unexpected performance plunge, monitor the health of your switches on site.
A novel approach to Artificial Intelligence On-Board
New generations of spacecraft are required to perform tasks with an increased level of autonomy. Space exploration, Earth Observation, space robotics, etc. are all growing fields in Space that require more sensors and more computational power to perform these missions.
Sensors, embedded processors, and hardware in general have evolved hugely in the last decade, equipping embedded systems with large numbers of sensors that produce data at rates not seen before, while simultaneously providing computing power capable of processing large volumes of data on board. Near-future spacecraft will be equipped with large numbers of sensors producing data at high rates, and their data processing power will increase significantly.
Future missions such as Active Debris Removal will rely on novel high-performance avionics to support image processing and Artificial Intelligence algorithms with large workloads. Similar requirements come from Earth Observation applications, where data processing on-board can be critical in order to provide real-time reliable information to Earth. This new scenario has brought new challenges with it: low determinism, excessive power needs, data losses and large response latency.
In this project, Klepsydra AI is used as a novel approach to on-board artificial intelligence. It provides a sophisticated threading model that combines pipelining and parallelization techniques applied to deep neural networks, making AI applications much more efficient and reliable. This new approach has been validated with several DNN models and two different computer architectures. The results show that the data processing rate and power saving of the applications increase substantially with respect to standard AI solutions.
1. Gives a basic idea of what Arduino is and its functionalities
2. Applications of Arduino
3. Arduino programming
4. What is NodeMCU?
5. Pin diagram of NodeMCU
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin... | Intel® Software
Integrated into Intel® Advisor, Cache-aware Roofline Modeling (CARM) provides insight into how an application behaves by helping to determine a) how optimally it works on a given hardware, b) the main factors that limit performance, c) if the workload is memory or compute-bound, and d) the right strategy to improve application performance.
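The memory-bound versus compute-bound question in (c) follows from the basic roofline formula: attainable performance is the smaller of the compute peak and bandwidth multiplied by arithmetic intensity. The sketch below shows that formula only and does not use Intel Advisor or its API. The function names and the machine figures are hypothetical. In a cache-aware roofline the same calculation is repeated with one bandwidth per memory level.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: min(compute peak, bandwidth * arithmetic intensity).

    intensity is in FLOP per byte, bandwidth in GB/s, peak in GFLOP/s.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)


def classify(peak_gflops, bandwidth_gbs, intensity):
    """Memory-bound left of the ridge point, compute-bound at or right of it."""
    ridge = peak_gflops / bandwidth_gbs      # FLOP/byte where the roofs meet
    return "memory-bound" if intensity < ridge else "compute-bound"


# Hypothetical machine: 100 GFLOP/s peak and one bandwidth roof per level.
PEAK = 100.0
BANDWIDTH = {"L1": 400.0, "L2": 200.0, "DRAM": 20.0}   # GB/s, made-up values

for level, bw in BANDWIDTH.items():
    for ai in (0.25, 2.0, 10.0):
        print(level, ai, attainable_gflops(PEAK, bw, ai), classify(PEAK, bw, ai))
```

With these made-up numbers, a kernel at 2 FLOP/byte is limited to 40 GFLOP/s when its data comes from DRAM, yet reaches the 100 GFLOP/s compute peak when served from L2, which is why the cache level matters for choosing an optimization strategy.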
Embedded Recipes 2019 - Introduction to JTAG debugging | Anne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what JTAG stands for and how the JTAG state machine operates. This is followed by an introduction to the free software JTAG tools OpenOCD and urJTAG. Marek briefly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what BSDL files are and how they are formatted, and gives a practical demonstration of how to blink an LED using BST and only free software tools.
Marek Vasut
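The JTAG state machine referred to above is the 16-state TAP controller from IEEE 1149.1, stepped by the TMS line on each TCK rising edge. The following is a minimal model of its transition table, not code from the talk or from OpenOCD or urJTAG. The walk helper is a name chosen here for illustration.

```python
# Next state for each TAP state, indexed by the TMS value (0 or 1).
TAP_NEXT = {
    "Test-Logic-Reset": ("Run-Test/Idle", "Test-Logic-Reset"),
    "Run-Test/Idle":    ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-DR-Scan":   ("Capture-DR", "Select-IR-Scan"),
    "Capture-DR":       ("Shift-DR", "Exit1-DR"),
    "Shift-DR":         ("Shift-DR", "Exit1-DR"),
    "Exit1-DR":         ("Pause-DR", "Update-DR"),
    "Pause-DR":         ("Pause-DR", "Exit2-DR"),
    "Exit2-DR":         ("Shift-DR", "Update-DR"),
    "Update-DR":        ("Run-Test/Idle", "Select-DR-Scan"),
    "Select-IR-Scan":   ("Capture-IR", "Test-Logic-Reset"),
    "Capture-IR":       ("Shift-IR", "Exit1-IR"),
    "Shift-IR":         ("Shift-IR", "Exit1-IR"),
    "Exit1-IR":         ("Pause-IR", "Update-IR"),
    "Pause-IR":         ("Pause-IR", "Exit2-IR"),
    "Exit2-IR":         ("Shift-IR", "Update-IR"),
    "Update-IR":        ("Run-Test/Idle", "Select-DR-Scan"),
}


def walk(state, tms_bits):
    """Clock a sequence of TMS values and return the resulting TAP state."""
    for tms in tms_bits:
        state = TAP_NEXT[state][tms]
    return state


# From reset, TMS = 0, 1, 0, 0 reaches Shift-DR, where data bits are shifted.
print(walk("Test-Logic-Reset", [0, 1, 0, 0]))

# Holding TMS high for five clocks returns to reset from any state.
print(all(walk(s, [1] * 5) == "Test-Logic-Reset" for s in TAP_NEXT))
```

The second check reflects why tools commonly start a session by clocking TMS high five times: it gives a known starting state without needing the optional TRST pin.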
19. WiLab @ IBBT: Environment Emulator
DUT can be a sensor, but also an actuator! Environment Emulators can be put in cascade.
[Diagram: DUT connected to an Environment Emulator and an iNode. Labels: PWR, SI, I(O), Power, Current, DAC/ADC, I2C, …, GP(I)O, USB, Ethernet + power, fixed interface (RS232, USB, …)]
45. Software iPlatform: General setup
Partition 1: SANET jobs
Partition 2: iNode jobs
/tmp/log on the iNode is automatically saved to log/job_run/nodeId on the NFS share.
47. Software iPlatform: Files needed for experiment
- Program files
- start_mount_code: This script is automatically started after booting. You can start your own software by calling it in this script.
- vmlinuz: Custom Linux kernel
- initrd.img: Custom drivers
- log directory: Automatically created; contains the content of the /tmp/log directory of the iNodes
48. Software iPlatform: concept
An iPlatform defines, for all iNodes in w-iLab.t, which user code will run on which iNode.
[Diagram: iNodes 1 to 7. Labels: Master, Slaves, 1.2.3.4:/master_code/, 1.2.3.4:/slave_code/, Not used]