In these slides, I introduce how I implemented the RSA256 algorithm in Verilog and verified it with Verilator.
The project uses C++ to build the C model and the SystemC model.
To help build the models, we created a C++ class, vint, that simulates the behavior of Verilog. It supports the normal Verilog operations with stricter rules.
The SystemC model can be translated directly into Verilog, so the intent of the Verilog design is clear and concise.
To simplify simulation, we limit each module to one input port and one output port. Each port uses the valid/ready protocol to control the data flow, which can be modeled as an sc_fifo in SystemC.
With these abstractions, we can easily implement unit tests for all of our modules and make sure they behave as intended.
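The valid/ready idea can be illustrated without SystemC. The following is only a minimal sketch: the `Channel` class and its method names are hypothetical and are not taken from the project or from sc_fifo. It assumes that "ready" means the channel has room and "valid" means data is waiting, so a transfer happens only when both sides agree.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <optional>

// Hypothetical stand-in for a valid/ready channel; not the project's API.
// A bounded queue plays the role that sc_fifo plays in the SystemC model.
template <typename T>
class Channel {
public:
    explicit Channel(std::size_t depth) : depth_(depth) {}

    bool ready() const { return q_.size() < depth_; }  // room for one more item
    bool valid() const { return !q_.empty(); }          // an item is waiting

    // Producer side: returns false (stall) when the channel is full.
    bool try_write(const T& v) {
        if (!ready()) return false;
        q_.push_back(v);
        return true;
    }

    // Consumer side: returns nullopt when nothing is available.
    std::optional<T> try_read() {
        if (!valid()) return std::nullopt;
        T v = q_.front();
        q_.pop_front();
        return v;
    }

private:
    std::size_t depth_;
    std::deque<T> q_;
};
```

With one such channel on the input and one on the output, a unit test only has to push stimulus and pop results; it never touches individual handshake wires.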
----
Please access the source code at:
https://github.com/yodalee/rsa256
COSCUP2023 RSA256 Verilator.pdf
1. Robust Verilog Testing Using Verilator & SystemC & C++17
Take RSA256 as an example
Yodalee <lc85301@gmail.com>
Yu-Sheng Lin <johnjohnlys@gmail.com>
3. Outline
● Verilator is a good, open-source SystemVerilog (SV) simulation tool
○ Verilator compiles SystemVerilog into a C++ class
○ Controlling the signals in a C++ testbench is tedious
● Arrays and structs can simplify SV coding
○ Use C++17 to build structs and arrays that work the same as in SV
● Design patterns for signals using SystemC
○ Unify the control interface
○ Map them to SystemC sc_fifo
● Case study: RSA 256
4. Why Verilator
● Open-source SystemVerilog simulation tool.
○ https://www.veripool.org/verilator/
○ https://github.com/verilator/verilator/
● Fast
● Decent SV support
● Free license.
○ Enables massively parallel simulation.
○ Suitable for CI.
5. How Verilator works
Verilog:
module Mod(
  input [7:0] i_data,
  output output_ok,
  output [12:0] o_data [2]
);
Generated C++:
struct VMod {
  u8 i_data;
  u8 output_ok;
  u16 o_data[2];
};
User C++ testbench:
VMod m;
m.i_data = 45;
while (not m.output_ok) {}
EXPECT_EQ(m.o_data[0], 30);
Simulation binary:
./run.exe
Gtest: 100 != 30
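The testbench flow on this slide can be sketched as follows. This is only an illustration under stated assumptions: `MockMod` is a hand-written stand-in, not a Verilator-generated class. Real generated models do expose an `eval()` method, but the behavior invented here (output after three evaluations, `o_data[0] = i_data + 1`) and the helper `wait_for_output` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hand-written stand-in for a generated model (hypothetical behavior):
// on the third eval() call it reports o_data[0] = i_data + 1.
struct MockMod {
    std::uint8_t  i_data = 0;
    std::uint8_t  output_ok = 0;
    std::uint16_t o_data[2] = {0, 0};

    // Signals only change when the model is evaluated.
    void eval() {
        if (++ticks_ >= 3) {
            o_data[0] = static_cast<std::uint16_t>(i_data + 1);
            output_ok = 1;
        }
    }

private:
    int ticks_ = 0;
};

// Calls eval() until output_ok rises or the evaluation budget runs out.
inline bool wait_for_output(MockMod& m, int max_evals) {
    for (int i = 0; i < max_evals && !m.output_ok; ++i) {
        m.eval();
    }
    return m.output_ok != 0;
}
```

Bounding the wait is a design choice for tests: a design that never raises `output_ok` makes the test fail instead of hanging the CI job.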
6. Challenges
The same Verilog, generated C++, and testbench as on the previous slide raise two problems:
● Too many signals to control
○ Solution: use struct
● Bitwidth information loss
○ For example, output [12:0] o_data becomes u16 o_data in the generated C++
7. Use Struct to Simplify SV & Challenges
Verilog:
module Mod(
  input [7:0] i_data,
  input [13:0] i_data2,
  input i_data3
);
Rewritten in SystemVerilog (struct support requires Verilator v5.000+ [1]):
typedef struct {
  logic [7:0] data;
  logic [13:0] data2;
  logic data3;
} ModIn;
module Mod(
  input ModIn i
);
Generated C++:
class VMod {
  u32 i;
};
Challenges:
● Bitwidth information loss
● Struct information loss [2]
[1] We use v5.006 right now.
[2] Verilator has many issues and PRs about this.
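To make the information loss concrete, here is a sketch of what a testbench has to do by hand when the 23-bit ModIn struct arrives as a single 32-bit word. The bit layout is an assumption (first field in the most significant bits, as in a SystemVerilog packed struct); the layout Verilator actually produces should be checked against the generated header. The `pack` and `unpack` helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Plain C++ mirror of the ModIn struct from the slide.
struct ModIn {
    std::uint8_t  data;   // logic [7:0]
    std::uint16_t data2;  // logic [13:0]
    bool          data3;  // logic
};

// Assumed layout: data in bits [22:15], data2 in bits [14:1], data3 in bit [0].
inline std::uint32_t pack(const ModIn& m) {
    return (std::uint32_t{m.data} << 15) |
           ((std::uint32_t{m.data2} & 0x3FFFu) << 1) |
           (m.data3 ? 1u : 0u);
}

// Inverse of pack() under the same assumed layout.
inline ModIn unpack(std::uint32_t bits) {
    ModIn m;
    m.data  = static_cast<std::uint8_t>((bits >> 15) & 0xFFu);
    m.data2 = static_cast<std::uint16_t>((bits >> 1) & 0x3FFFu);
    m.data3 = (bits & 1u) != 0;
    return m;
}
```

Every shift and mask here repeats width information that the SystemVerilog source already stated, which is the duplication a shared type system is meant to remove.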
8. We need int/array/struct that works for SV and C++
● Challenge: information loss when converting from SV to Verilator C++
○ Bitwidth & struct information
● Why?
○ The interface that Verilator converts to is not standardized.
● Solution: Abstraction
○ Build an SV-compatible type system with C++17
SystemVerilog:
typedef struct {
  logic [7:0] data;
  logic [13:0] data2;
  logic data3;
} ModIn;
Generated C++:
class ModIn {
  u32 i;
};
Diagram labels: User code, Adaptor, Generated C++, SV-C++ interface
9. 3 weapons mimicking the SV typing system
● vuint<11>
○ logic [10:0] sig;
○ Replaces sc_uint
● varray<vuint<11>, 3, 4>
○ logic [10:0] sig [3][4];
○ std::array with multiple dimensions
● vstruct (macro)
○ typedef struct packed { ... } iStruct;
○ C++17-based type-reflection struct supporting $bits, $pack
(Example later)
10. Arbitrary bitwidth integer
vuint<11> v corresponds to logic [10:0] v
● Why reinvent sc_uint<int>?
○ sc_uint has virtual functions, so we cannot memcpy it.
○ sc_uint must link against libsystemc.
○ You need sc_biguint<int> for wide integers.
● vuint is strictly typed
○ vuint<10> == vuint<11> is disallowed, as many lint tools require
● C++11 type deduction and variadic templates
○ auto val = Concat(vuint<4>, vuint<99>, vuint<2>)
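A minimal sketch of a strictly typed, fixed-width integer with a variadic `Concat` might look like the following. `Bits<N>` is a hypothetical stand-in, not the project's vuint: it is limited to 64 bits, so the slide's 99-bit operand would not fit, and only the width bookkeeping is shown.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: a hypothetical fixed-width unsigned value, 1..64 bits wide.
template <unsigned N>
struct Bits {
    static_assert(N >= 1 && N <= 64, "sketch supports 1..64 bits");
    static constexpr unsigned width = N;
    std::uint64_t value = 0;

    constexpr Bits() = default;
    // Wrap on construction, like assigning to logic [N-1:0].
    constexpr explicit Bits(std::uint64_t v) : value(v & mask()) {}

    static constexpr std::uint64_t mask() {
        return ~std::uint64_t{0} >> (64 - N);
    }

    // Only same-width comparison exists; Bits<10> == Bits<11> does not compile.
    friend constexpr bool operator==(Bits a, Bits b) { return a.value == b.value; }
};

// Two-operand concatenation: the first argument lands in the high bits, like {a, b}.
template <unsigned A, unsigned B>
constexpr Bits<A + B> Concat(Bits<A> a, Bits<B> b) {
    return Bits<A + B>((a.value << B) | b.value);
}

// Variadic overload: fold from the left, so the result width is the sum of all widths.
template <unsigned A, unsigned B, unsigned... Rest>
constexpr auto Concat(Bits<A> a, Bits<B> b, Bits<Rest>... rest) {
    return Concat(Concat(a, b), rest...);
}
```

With this sketch, `auto v = Concat(Bits<4>(0xA), Bits<8>(0x5C), Bits<2>(0x3));` deduces `Bits<14>` without the caller spelling out the width.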
11. Array
● Just like std::array, but multi-dimensional.
● Also supports arrays of structs.
varray<vuint<11>,2,3> v corresponds to
logic [10:0] v [2][3];
or
logic [0:1][0:2][10:0] v;
We treat them the same.
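One way to get a multi-dimensional std::array from a list of dimensions is a recursive alias. `NdArray` and `element_count` below are hypothetical names for this sketch and are not the project's varray.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch only: NdArray<T, 2, 3> expands to std::array<std::array<T, 3>, 2>.
template <class T, std::size_t... Dims>
struct NdArrayImpl;

// No dimensions left: the element type itself.
template <class T>
struct NdArrayImpl<T> {
    using type = T;
};

// Peel off the outermost dimension and recurse on the rest.
template <class T, std::size_t First, std::size_t... Rest>
struct NdArrayImpl<T, First, Rest...> {
    using type = std::array<typename NdArrayImpl<T, Rest...>::type, First>;
};

template <class T, std::size_t... Dims>
using NdArray = typename NdArrayImpl<T, Dims...>::type;

// Total number of elements: the product of all dimensions (C++17 fold expression).
template <std::size_t... Dims>
constexpr std::size_t element_count() {
    return (std::size_t{1} * ... * Dims);
}
```

Indexing then reads like the SystemVerilog declaration: `NdArray<std::uint16_t, 2, 3> v; v[1][2] = 42;`.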
12. vstruct (macro) : Verilog struct
● C++17 magic!
● We want to fully utilize the C++ standard.
● In SystemVerilog:
○ $bits(oStruct) == 36
○ logic [$bits(oStruct)-1:0] v;
● Our C++ API allows us to:
○ vuint<bit<oStruct>> value;
○ vuint<36> value = packed(oStruct);
○ Also supported: unpack, JSON printing, Verilator I/O
struct iStruct {
  vuint<3> a;
  vuint<10> b;
};
struct oStruct {
  vuint<10> sig;
  varray<iStruct, 2> c;
};
13. vstruct enabled by type reflection (oversimplified)
● All you need is one macro line per struct.
● Based on boost::hana, you can loop through the struct...
struct oStruct {
  vuint<10> sig;
  varray<iStruct, 2> c;
  MY_MACRO(sig, c);
};
The macro expands to...
auto& get<0>() { return sig; }
auto& get<1>() { return c; }
constexpr unsigned N = 2;
Implement (pseudocode):
bit<T>
  unsigned b = 0;
  for (int i = 0; i < T::N; ++i)
  {
    b += bit<get<i>()>;
  }
  return b;
packed(t)
  return Concat(
    packed(get<i>()) for i in range(N)
  );
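The pseudocode above can be approximated in plain C++17. This sketch swaps boost::hana for `std::tie`, `std::apply` and fold expressions, computes widths at run time rather than at compile time, and assumes the total width fits in 64 bits. `Field`, `members()`, `bit_count` and `pack_bits` are hypothetical names, with `members()` playing the role of the one-line macro.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>

// Hypothetical leaf field, 1..64 bits wide.
template <unsigned N>
struct Field {
    static_assert(N >= 1 && N <= 64, "sketch supports 1..64 bits");
    std::uint64_t value = 0;
};

struct InStruct {
    Field<3> a;
    Field<10> b;
    auto members() const { return std::tie(a, b); }  // stand-in for the macro
};

struct OutStruct {
    Field<10> sig;
    std::array<InStruct, 2> c;
    auto members() const { return std::tie(sig, c); }  // stand-in for the macro
};

// Leaf: width and masked value of one field.
template <unsigned N>
unsigned bit_count(const Field<N>&) { return N; }

template <unsigned N>
std::uint64_t pack_bits(const Field<N>& f) {
    return f.value & (~std::uint64_t{0} >> (64 - N));
}

// Array: element 0 lands in the highest bits.
template <class T, std::size_t N>
unsigned bit_count(const std::array<T, N>& a) {
    unsigned total = 0;
    for (const T& e : a) total += bit_count(e);
    return total;
}

template <class T, std::size_t N>
std::uint64_t pack_bits(const std::array<T, N>& a) {
    std::uint64_t out = 0;
    for (const T& e : a) out = (out << bit_count(e)) | pack_bits(e);
    return out;
}

// Struct: loop over members(); the first member lands in the highest bits.
template <class S>
auto bit_count(const S& s) -> decltype(s.members(), unsigned{}) {
    return std::apply([](const auto&... m) { return (0u + ... + bit_count(m)); },
                      s.members());
}

template <class S>
auto pack_bits(const S& s) -> decltype(s.members(), std::uint64_t{}) {
    return std::apply([](const auto&... m) {
        std::uint64_t out = 0;
        ((out = (out << bit_count(m)) | pack_bits(m)), ...);
        return out;
    }, s.members());
}
```

For `OutStruct`, `bit_count` returns 10 + 2 × (3 + 10) = 36, matching the width quoted on slide 12.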
15. How Verilator works
(Same Verilog, generated C++, and testbench flow as on slide 5.)
● How to drive the module?
● How to feed data?
● Verilog can have any kind of interface
16. The Valid/Ready Protocol
● Used in the ARM AXI, Intel Avalon... specifications
● Cycle 1: Sender sets valid to 0; there is no data to transfer.
● Cycle 2: Sender sets valid to 1 because it has data to transfer, but Receiver sets ready to 0, so Sender holds valid and data.
● Cycle 3: Receiver sets ready to 1, so Sender can set valid to 0 in the next cycle or send the next data.
Diagram: Sender drives Valid to Receiver; Receiver drives Ready to Sender.
17. The Valid/Ready Protocol
● Sender should hold valid until Receiver accepts the data.
● Sender should not change data before Receiver accepts it.
● Receiver can freely set/reset ready.
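These rules can be sketched as a small cycle-based model. `Sender` and `step` are hypothetical names for this illustration: a transfer happens only in a cycle where valid and ready are both 1, and until then the sender keeps valid and data unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

// Sketch of a sender that obeys the valid/ready rules, one call per clock cycle.
struct Sender {
    std::deque<std::uint32_t> pending;  // data waiting to be sent
    bool valid = false;
    std::uint32_t data = 0;

    // `ready` is the receiver's ready in this cycle.
    // Returns true when a transfer happened (valid && ready).
    bool step(bool ready, std::vector<std::uint32_t>& received) {
        // Offer the next item only when nothing is currently being offered.
        if (!valid && !pending.empty()) {
            data = pending.front();
            pending.pop_front();
            valid = true;
        }
        const bool fire = valid && ready;
        if (fire) {
            received.push_back(data);
            valid = false;  // free to offer the next item in the next cycle
        }
        // Otherwise valid and data are held unchanged.
        return fire;
    }
};
```

The receiver side is just the `ready` argument, which may change freely from cycle to cycle.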
18. The Valid/Ready Protocol
We can abstract the valid/ready protocol with SystemC sc_fifo:
● sc_fifo full → Valid = 1, Ready = 0
● sc_fifo empty → Valid = 0, Ready = 1
Diagram: Sender and Receiver connected by Valid/Ready become Sender and Receiver connected by an sc_fifo.
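The mapping can be made concrete with a bounded queue in plain C++. `BoundedFifo` is a hypothetical stand-in for sc_fifo, used here so the sketch does not depend on SystemC: ready is "not full" on the producer side and valid is "not empty" on the consumer side.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Sketch: a bounded FIFO viewed as a valid/ready channel.
template <class T>
class BoundedFifo {
public:
    explicit BoundedFifo(std::size_t capacity) : capacity_(capacity) {}

    // Producer side: ready to accept data while not full.
    bool ready() const { return items_.size() < capacity_; }
    // Consumer side: data is valid while not empty.
    bool valid() const { return !items_.empty(); }

    // Non-blocking write; succeeds only when ready.
    bool try_write(const T& v) {
        if (!ready()) return false;
        items_.push(v);
        return true;
    }

    // Non-blocking read; succeeds only when valid.
    bool try_read(T& out) {
        if (!valid()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
};
```

With capacity 1, the two bullets above are exactly the two states of the FIFO: full gives valid = 1 and ready = 0, empty gives valid = 0 and ready = 1.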
19. SystemC Abstraction of Module
Limit module to be one input, one output.
● sc_fifo as input/output interface
● Define input/output with structure or alias
SC_MODULE(Montgomery) {
  sc_in_clk clk;
  sc_fifo_in<MontgomeryIn> data_in;
  sc_fifo_out<MontgomeryOut> data_out;
  SC_CTOR(Montgomery) {
    SC_THREAD(Thread);
  }
  void Thread();
};
void Montgomery::Thread() {
  while (true) {
    MontgomeryIn in = data_in.read();
    KeyType a = in.a;
    KeyType b = in.b;
    KeyType round_result(0);
    …
    data_out.write(round_result);
  }
}
20. Translate SystemC to Verilog
The SystemC module is fully tested and translated to Verilog.
Keep the module simple, otherwise it will be difficult to translate.
sc_fifo translates to valid/ready and optional data.
SystemC:
void Montgomery::Thread() {
  while (true) {
    MontgomeryIn in = data_in.read();
    KeyType a = in.a;
    KeyType b = in.b;
    KeyType round_result(0);
    …
    data_out.write(round_result);
  }
}
Verilog:
module Montgomery(
  // input data
  input i_valid,
  output i_ready,
  input MontgomeryIn i_in,
  …
);
always_ff @(posedge clk) begin
  if (i_ready && i_valid) begin
    a <= {2'b0, i_in.a};
    b <= {2'b0, i_in.b};
    round_result <= 'b0;
  end
end
…
21. Verilog Testbench
● Wrap the DUT class generated by Verilator
○ Assume that the DUT has i_valid/i_ready and o_valid/o_ready
○ Testbench generates the clock with sc_clock
○ The driver/monitor implement before_clk and after_clk to control valid/ready
Diagram: inside the Testbench, the Driver connects to the DUT through i_valid/i_ready, and the DUT connects to the Monitor through o_valid/o_ready.
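A rough sketch of the driver/monitor idea, without Verilator or SystemC, could look like this. `before_clk` and `after_clk` are named on the slide, but their signatures here are guesses, and `MockDut` is a hypothetical one-entry buffer that outputs its input plus one, not a generated model.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical DUT: a one-entry buffer that outputs input + 1.
struct MockDut {
    bool i_valid = false, o_ready = false;  // driven by the testbench
    std::uint32_t i_data = 0;
    bool i_ready = true, o_valid = false;   // driven by the DUT
    std::uint32_t o_data = 0;

    // Rising clock edge: sample the handshakes, then update the buffer.
    void posedge() {
        const bool take = i_valid && i_ready;
        const bool give = o_valid && o_ready;
        if (give) o_valid = false;
        if (take) { o_data = i_data + 1; o_valid = true; }
        i_ready = !o_valid;
    }
};

struct Driver {
    std::deque<std::uint32_t> todo;
    bool fired = false;
    // Before the edge: offer the next item and note whether it will be accepted.
    void before_clk(MockDut& d) {
        d.i_valid = !todo.empty();
        if (d.i_valid) d.i_data = todo.front();
        fired = d.i_valid && d.i_ready;
    }
    // After the edge: drop the item only if the handshake happened.
    void after_clk() { if (fired) todo.pop_front(); }
};

struct Monitor {
    std::vector<std::uint32_t> seen;
    bool fired = false;
    std::uint32_t captured = 0;
    // Before the edge: always ready in this sketch; capture data if valid.
    void before_clk(MockDut& d) {
        d.o_ready = true;
        fired = d.o_valid && d.o_ready;
        captured = d.o_data;
    }
    void after_clk() { if (fired) seen.push_back(captured); }
};

// Run the testbench loop for a fixed number of clock cycles.
std::vector<std::uint32_t> run(std::deque<std::uint32_t> inputs, int cycles) {
    MockDut dut;
    Driver drv;
    Monitor mon;
    drv.todo = std::move(inputs);
    for (int i = 0; i < cycles; ++i) {
        drv.before_clk(dut);
        mon.before_clk(dut);
        dut.posedge();
        drv.after_clk();
        mon.after_clk();
    }
    return mon.seen;
}
```

Because the driver and monitor only touch valid/ready and data, the same loop would fit any module that follows the single-input, single-output convention.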
22. Test Methodology
● At least 2 sets of input/output data
○ More is better
○ Never test with just 1 set of input/output
● Random
○ Deterministic input/output for basic correctness
○ Random input/output to find more issues
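A minimal sketch of this methodology follows: a few fixed vectors first, then seeded random vectors, all checked against a golden model. The functions are hypothetical and use 32-bit modular exponentiation as a small stand-in for the real 256-bit design; `dut_modexp` is a deliberately different, naive implementation playing the role of the module under test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Golden model: square-and-multiply; 32-bit operands keep products within 64 bits.
std::uint32_t golden_modexp(std::uint32_t base, std::uint32_t exp, std::uint32_t mod) {
    std::uint64_t result = 1 % mod;
    std::uint64_t b = base % mod;
    while (exp != 0) {
        if (exp & 1u) result = result * b % mod;
        b = b * b % mod;
        exp >>= 1;
    }
    return static_cast<std::uint32_t>(result);
}

// Stand-in for the design under test: naive repeated multiplication.
std::uint32_t dut_modexp(std::uint32_t base, std::uint32_t exp, std::uint32_t mod) {
    std::uint64_t result = 1 % mod;
    for (std::uint32_t i = 0; i < exp; ++i) result = result * (base % mod) % mod;
    return static_cast<std::uint32_t>(result);
}

struct TestVector {
    std::uint32_t base, exp, mod;
};

// Fixed vectors first, then seeded random ones so failures are reproducible.
std::vector<TestVector> make_vectors(unsigned seed, std::size_t random_count) {
    std::vector<TestVector> v = {{2, 10, 1000}, {7, 0, 13}, {123456789, 65, 4294967291u}};
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::uint32_t> any_base(0, 0xFFFFFFFFu);
    std::uniform_int_distribution<std::uint32_t> small_exp(0, 500);
    std::uniform_int_distribution<std::uint32_t> nonzero_mod(1, 0xFFFFFFFFu);
    for (std::size_t i = 0; i < random_count; ++i) {
        v.push_back({any_base(rng), small_exp(rng), nonzero_mod(rng)});
    }
    return v;
}

// Count the vectors where the DUT disagrees with the golden model.
std::size_t count_mismatches(const std::vector<TestVector>& vectors) {
    std::size_t bad = 0;
    for (const TestVector& t : vectors) {
        if (dut_modexp(t.base, t.exp, t.mod) != golden_modexp(t.base, t.exp, t.mod)) ++bad;
    }
    return bad;
}
```

Keeping the seed as a parameter means a failing random case can be replayed exactly.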
23. The RSA256 Implementation
https://github.com/yodalee/rsa256
● 4 modules, all with single input/single output.
● Golden data generated by the C model (or a Python script)
● 1 to 1 mapping from SystemC to Verilog
Diagram: RSA256 takes plaintext, key, and modulus and produces ciphertext; the other modules shown are TwoPower, RSAMont, and Montgomery.
24. Conclusion
● The data types in SystemC are not suitable for simulating Verilog
○ We create vint, varray and vstruct.
○ These data types map directly to SystemVerilog
● We design, implement and test the RSA 256 modules, and validate them with Verilator
○ With abstraction over the interface, the designer (a.k.a. me) can focus on test data.
25. Some (Possible) Future Work
● Replace SystemC with a pure C++ framework
● Support complex interfaces like AXI
● Actually tape out the chip with the SkyWater service