SlideShare a Scribd company logo
Program design and
analysis
mr.C.KARTHIKEYAN
AP/ECE/RMKCET
Program design and
analysis
Program-level performance analysis.
Optimizing for:
Execution time.
Energy/power.
Program size.
Program validation and testing.
Program-level performance
analysis
 Need to understand
performance in detail:
Real-time behavior, not
just typical.
On complex platforms.
 Program performance 
CPU performance:
Pipeline, cache are
windows into program.
We must analyze the entire
program.
Complexities of program
performance
Varies with input data:
Different-length paths.
Cache effects.
Instruction-level performance variations:
Pipeline interlocks.
Fetch times.
How to measure program
performance
Simulate execution of the CPU.
Makes CPU state visible.
Measure on real CPU using timer.
Requires modifying the program to control
the timer.
Measure on real CPU using logic analyzer.
Requires events visible on the pins.
Program performance
metrics
Average-case execution time.
Typically used in application programming.
Worst-case execution time.
A component in deadline satisfaction.
Best-case execution time.
Task-level interactions can cause best-case
program behavior to result in worst-case
system behavior.
Elements of program
performance
Basic program execution time formula:
execution time = program path + instruction timing
Solving these problems independently helps
simplify analysis.
Easier to separate on simpler CPUs.
Accurate performance analysis requires:
Assembly/binary code.
Execution platform.
Data-dependent paths in
an if statement
if (a || b) { /* T1 */
if ( c ) /* T2 */
x = r*s+t; /* A1 */
else y=r+s; /* A2 */
z = r+s+u; /* A3 */
}
else {
if ( c ) /* T3 */
y = r-t; /* A4 */
}
a b c path
0 0 0 T1=F, T3=F: no assignments
0 0 1 T1=F, T3=T: A4
0 1 0 T1=T, T2=F: A2, A3
0 1 1 T1=T, T2=T: A1, A3
1 0 0 T1=T, T2=F: A2, A3
1 0 1 T1=T, T2=T: A1, A3
1 1 0 T1=T, T2=F: A2, A3
1 1 1 T1=T, T2=T: A1, A3
Paths in a loop
for (i=0, f=0; i<N; i++)
f = f + c[i] * x[i];
i=0
f=0
i=N
f = f + c[i] * x[i]
i = i + 1
N
Y
Instruction timing
 Not all instructions take the same amount of time.
Multi-cycle instructions.
Fetches.
 Execution times of instructions are not independent.
Pipeline interlocks.
Cache effects.
 Execution times may vary with operand value.
Floating-point operations.
Mesaurement-driven
performance analysis
Program trace
Simulator
H/S
Profiling-it counts the no. of times that
procedures or basic blocks I the program
are executed
Physical measurement
In-circuit emulator allows tracing.
Affects execution timing.
Logic analyzer can measure behavior at pins.
Address bus can be analyzed to look for events.
Code can be modified to make events visible.
Particularly important for real-world input
streams.
CPU simulation
Some simulators are less accurate.
Cycle-accurate simulator provides
accurate clock-cycle timing.
Simulator models CPU internals.
Simulator writer must know how CPU works.
Performance optimization
motivation
Embedded systems must often meet
deadlines.
Faster may not be fast enough.
Need to be able to analyze execution
time.
Worst-case, not typical.
Need techniques for reliably improving
execution time.
Programs and performance
analysis
Best results come from analyzing
optimized instructions, not high-level
language code:
non-obvious translations of HLL statements
into instructions;
code may move;
cache effects are hard to predict.
SOFTWARE PERFORMNACE OPTIMIZATION
Loop optimizations
Loops are good targets for optimization.
Basic loop optimizations:
code motion;
induction-variable elimination;
strength reduction (x*2 -> x<<1).
Code motion
BEFORE OPTIMIZATION
for (i=0, f=0; i<N; i++)
{
x=y+z;
a[i]=6*i;
}
AFTER OPTIMIZATION
x=y+z;
for (i=0, f=0; i<N; i++)
{
a[i]=6*I;
}
Induction variable
elimination (nested loop)
BEFORE OPTIMIZATION
for (i=0; i<N; i++)
{
for (j=0; j<M; j++)
{
z[i] [j]= b[i] [j];
}
}
AFTER OPTIMIZATION
for (i=0; i<N; i++)
for (j=0; j<M; j++)
{
Zbinduct = i*M+j;
*(Zptr+Zbinduct)=*(bptr+
Z binduct);
}
Strength reduction
X=2
Y=x*2
ls 1
2->0010
Ls 1-> 0100
Cache analysis
Loop nest: set of loops, one inside other.
Perfect loop nest: no conditionals in nest.
Because loops use large quantities of
data, cache conflicts are common.
Array conflicts in cache
a[0,0]
b[0,0]
main memory cache
1024 4099
...
1024
4099
Array conflicts, cont’d.
Array elements conflict because they are
in the same line, even if not mapped to
same location.
Solutions:
move one array;
pad array.
Performance optimization
hints
Use registers efficiently.
Use page mode memory accesses.
Analyze cache behavior:
instruction conflicts can be handled by
rewriting code, rescheudling;
conflicting array data can be moved, padded.
PROGRAM LEVEL ENERGY AND POWER
ANALYSIS OPTIMIZATION
Energy/power optimization
Energy: ability to do work.
Most important in battery-powered systems.
Power: energy per unit time.
Important even in wall-plug systems---power
becomes heat.
Measuring energy
consumption
Execute a small loop, measure current:
while (TRUE)
a();
I
Cache behavior is
important
Energy consumption has a sweet spot as
cache size changes:
cache too small: program thrashes, burning
energy on external memory accesses;
cache too large: cache itself burns too much
power.
Optimizing for energy
First-order optimization:
high performance = low energy.
Not many instructions trade speed for
energy.
Optimizing for energy,
cont’d.
Use registers efficiently.
Identify and eliminate cache conflicts.
Moderate loop unrolling eliminates some
loop overhead instructions.
Eliminate pipeline stalls.
Inlining procedures may help: reduces
linkage, but may increase cache
thrashing.
ANALYSIS AND OPTIMIZATION OF
PROGRAM SIZE
Optimizing for program
size
Goal:
reduce hardware cost of memory;
reduce power consumption of memory units.
Two opportunities:
data;
instructions.
Data size minimization
Reuse constants, variables, data buffers in
different parts of code.
Requires careful verification of correctness.
Encapsulating functions in subroutines when
done carefully can reduce program size
PROGRAM VALIDATION AND
TESTING
Program validation and
testing
Concentrate here on functional verification.
Major testing strategies:
Black box doesn’t look at the source code.
Clear box (white box) does look at the
source code.
Clear-box testing
Examine the source code to determine whether
it works:
Can you actually exercise a path?
Do you get the value you expect along a path?
Testing procedure:
Controllability: rovide program with inputs.
Execute.
Observability: examine outputs.
Execution paths and
testing
Paths are important in functional testing
as well as performance analysis.
In general, an exponential number of
paths through the program.
Show that some paths dominate others.
Heuristically limit paths.
Choosing the paths to test
Possible criteria:
Execute every
statement at least
once.
Execute every branch
direction at least once.
Equivalent for
structured programs.
Not true for gotos.
not covered
Basis paths
Approximate CDFG
with undirected
graph.
Undirected graphs
have basis paths:
All paths are linear
combinations of basis
paths.
Cyclomatic complexity
Cyclomatic complexity
is a bound on the size
of basis sets:
e = # edges
n = # nodes
p = number of graph
components
M = e – n + 2p.
Branch testing
Heuristic for testing branches.
Exercise true and false branches of
conditional.
Exercise every simple condition at least once.
Domain testing
Heuristic test for
linear inequalities.
Test on each side +
boundary of
inequality.
Def-use pairs
Variable def-use:
Def when value is
assigned (defined).
Exercise each def-use
pair.
Loop testing
Loops need specialized tests to be tested
efficiently.
Heuristic testing strategy:
Skip loop entirely.
One loop iteration.
Two loop iterations.
# iterations much below max.
n-1, n, n+1 iterations where n is max.
Black-box testing
Complements clear-box testing.
May require a large number of tests.
Tests software in different ways.
Black-box test vectors
Random tests.
May weight distribution based on software
specification.
Regression tests.
Tests of previous versions, bugs, etc.
May be clear-box tests of previous versions.
What atlast?
 Must have the knowledge of compiler opt.
 Well worse in Embedded C
 To interpret the data structures whenever needed
 Must know the DFG and CDFG
 Wide knowledge on processor (target)
 Techniques in prog optimization
 Develop the code which occupies less memory space
with much features
 Smart usage of loops in the code

More Related Content

What's hot

Chap6 procedures &amp; macros
Chap6 procedures &amp; macrosChap6 procedures &amp; macros
Chap6 procedures &amp; macros
HarshitParkar6677
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler design
Anul Chaudhary
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
MITS Gwalior
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
Akhila Prabhakaran
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
Vignesh Tamil
 
Lect08
Lect08Lect08
Introduction to MPI
Introduction to MPIIntroduction to MPI
Introduction to MPI
Akhila Prabhakaran
 
Multi phase mixture media
Multi phase mixture mediaMulti phase mixture media
Multi phase mixture media
Modelon
 
Model Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEWModel Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEW
yurongwang1
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computer
rud_d_rcks
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
Sumit Mittu
 
Automatic Generation of Peephole Superoptimizers
Automatic Generation of Peephole SuperoptimizersAutomatic Generation of Peephole Superoptimizers
Automatic Generation of Peephole Superoptimizerskeanumit
 
Macro-processor
Macro-processorMacro-processor
Macro-processor
Temesgen Molla
 
Compiler Optimization-Space Exploration
Compiler Optimization-Space ExplorationCompiler Optimization-Space Exploration
Compiler Optimization-Space Explorationtmusabbir
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing ModelAdlin Jeena
 
Systolic, Transposed & Semi-Parallel Architectures and Programming
Systolic, Transposed & Semi-Parallel Architectures and ProgrammingSystolic, Transposed & Semi-Parallel Architectures and Programming
Systolic, Transposed & Semi-Parallel Architectures and Programming
Sandip Jassar (sandipjassar@hotmail.com)
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architecture
Sourav Routh
 
Programming using MPI and OpenMP
Programming using MPI and OpenMPProgramming using MPI and OpenMP
Programming using MPI and OpenMP
Divya Tiwari
 
TensorRT survey
TensorRT surveyTensorRT survey
TensorRT survey
Yi-Hsiu Hsu
 
The role of the cpu in the operation
The role of the cpu in the operationThe role of the cpu in the operation
The role of the cpu in the operationmary_ramsay
 

What's hot (20)

Chap6 procedures &amp; macros
Chap6 procedures &amp; macrosChap6 procedures &amp; macros
Chap6 procedures &amp; macros
 
Peephole optimization techniques in compiler design
Peephole optimization techniques in compiler designPeephole optimization techniques in compiler design
Peephole optimization techniques in compiler design
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
Introduction to OpenMP
Introduction to OpenMPIntroduction to OpenMP
Introduction to OpenMP
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
 
Lect08
Lect08Lect08
Lect08
 
Introduction to MPI
Introduction to MPIIntroduction to MPI
Introduction to MPI
 
Multi phase mixture media
Multi phase mixture mediaMulti phase mixture media
Multi phase mixture media
 
Model Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEWModel Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEW
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computer
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Automatic Generation of Peephole Superoptimizers
Automatic Generation of Peephole SuperoptimizersAutomatic Generation of Peephole Superoptimizers
Automatic Generation of Peephole Superoptimizers
 
Macro-processor
Macro-processorMacro-processor
Macro-processor
 
Compiler Optimization-Space Exploration
Compiler Optimization-Space ExplorationCompiler Optimization-Space Exploration
Compiler Optimization-Space Exploration
 
Parallel Programing Model
Parallel Programing ModelParallel Programing Model
Parallel Programing Model
 
Systolic, Transposed & Semi-Parallel Architectures and Programming
Systolic, Transposed & Semi-Parallel Architectures and ProgrammingSystolic, Transposed & Semi-Parallel Architectures and Programming
Systolic, Transposed & Semi-Parallel Architectures and Programming
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architecture
 
Programming using MPI and OpenMP
Programming using MPI and OpenMPProgramming using MPI and OpenMP
Programming using MPI and OpenMP
 
TensorRT survey
TensorRT surveyTensorRT survey
TensorRT survey
 
The role of the cpu in the operation
The role of the cpu in the operationThe role of the cpu in the operation
The role of the cpu in the operation
 

Similar to Unit 3 part2

pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
ssuser87fa0c1
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed Database
Md. Shamsur Rahim
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdf
NayanChandak1
 
Training - What is Performance ?
Training  - What is Performance ?Training  - What is Performance ?
Training - What is Performance ?
Betclic Everest Group Tech Team
 
Vedic Calculator
Vedic CalculatorVedic Calculator
Vedic Calculator
divyang_panchasara
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
Akhil Kaushik
 
SE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptxSE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptx
PraneethBhai1
 
Cs 568 Spring 10 Lecture 5 Estimation
Cs 568 Spring 10  Lecture 5 EstimationCs 568 Spring 10  Lecture 5 Estimation
Cs 568 Spring 10 Lecture 5 Estimation
Lawrence Bernstein
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)
praveena p
 
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptxICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
johnsmith96441
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
Praveen Penumathsa
 
Qat09 presentations dxw07u
Qat09 presentations dxw07uQat09 presentations dxw07u
Qat09 presentations dxw07uShubham Sharma
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptx
Syed Zaid Irshad
 
Lecture 3
Lecture 3Lecture 3
Lecture 3Mr SMAK
 
Computer architecture short note (version 8)
Computer architecture short note (version 8)Computer architecture short note (version 8)
Computer architecture short note (version 8)
Nimmi Weeraddana
 
SecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.pptSecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.ppt
RubenGabrielHernande
 

Similar to Unit 3 part2 (20)

pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed Database
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdf
 
Training - What is Performance ?
Training  - What is Performance ?Training  - What is Performance ?
Training - What is Performance ?
 
Vedic Calculator
Vedic CalculatorVedic Calculator
Vedic Calculator
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
 
SE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptxSE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptx
 
Cs 568 Spring 10 Lecture 5 Estimation
Cs 568 Spring 10  Lecture 5 EstimationCs 568 Spring 10  Lecture 5 Estimation
Cs 568 Spring 10 Lecture 5 Estimation
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)
 
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptxICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
 
Qat09 presentations dxw07u
Qat09 presentations dxw07uQat09 presentations dxw07u
Qat09 presentations dxw07u
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptx
 
Lecture1
Lecture1Lecture1
Lecture1
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Computer architecture short note (version 8)
Computer architecture short note (version 8)Computer architecture short note (version 8)
Computer architecture short note (version 8)
 
Testing
TestingTesting
Testing
 
SecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.pptSecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.ppt
 

More from Karthik Vivek

Peak detector, instrumentation amp
Peak detector, instrumentation ampPeak detector, instrumentation amp
Peak detector, instrumentation amp
Karthik Vivek
 
U3 op amp applications
U3 op amp applicationsU3 op amp applications
U3 op amp applications
Karthik Vivek
 
Unit 1 ic fab
Unit 1 ic fabUnit 1 ic fab
Unit 1 ic fab
Karthik Vivek
 
Fabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe tsFabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe ts
Karthik Vivek
 
Unit 3 part2
Unit 3 part2Unit 3 part2
Unit 3 part2
Karthik Vivek
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimization
Karthik Vivek
 
Embedded programming u3 part 1
Embedded programming u3 part 1Embedded programming u3 part 1
Embedded programming u3 part 1
Karthik Vivek
 
ARM stacks, subroutines, Cortex M3, LPC 214X
ARM  stacks, subroutines, Cortex M3, LPC 214XARM  stacks, subroutines, Cortex M3, LPC 214X
ARM stacks, subroutines, Cortex M3, LPC 214X
Karthik Vivek
 
ARM inst set part 2
ARM inst set part 2ARM inst set part 2
ARM inst set part 2
Karthik Vivek
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
Karthik Vivek
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
Karthik Vivek
 
ARM Versions, architecture
ARM Versions, architectureARM Versions, architecture
ARM Versions, architecture
Karthik Vivek
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
Karthik Vivek
 
Unit2 arm
Unit2 armUnit2 arm
Unit2 arm
Karthik Vivek
 
Unit 1c
Unit 1cUnit 1c
Unit 1c
Karthik Vivek
 
Unit 1b
Unit 1bUnit 1b
Unit 1b
Karthik Vivek
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
Karthik Vivek
 
Introduction
IntroductionIntroduction
Introduction
Karthik Vivek
 
unit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONSunit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONS
Karthik Vivek
 
VLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTORVLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTOR
Karthik Vivek
 

More from Karthik Vivek (20)

Peak detector, instrumentation amp
Peak detector, instrumentation ampPeak detector, instrumentation amp
Peak detector, instrumentation amp
 
U3 op amp applications
U3 op amp applicationsU3 op amp applications
U3 op amp applications
 
Unit 1 ic fab
Unit 1 ic fabUnit 1 ic fab
Unit 1 ic fab
 
Fabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe tsFabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe ts
 
Unit 3 part2
Unit 3 part2Unit 3 part2
Unit 3 part2
 
Compiler optimization
Compiler optimizationCompiler optimization
Compiler optimization
 
Embedded programming u3 part 1
Embedded programming u3 part 1Embedded programming u3 part 1
Embedded programming u3 part 1
 
ARM stacks, subroutines, Cortex M3, LPC 214X
ARM  stacks, subroutines, Cortex M3, LPC 214XARM  stacks, subroutines, Cortex M3, LPC 214X
ARM stacks, subroutines, Cortex M3, LPC 214X
 
ARM inst set part 2
ARM inst set part 2ARM inst set part 2
ARM inst set part 2
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
 
ARM Versions, architecture
ARM Versions, architectureARM Versions, architecture
ARM Versions, architecture
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
 
Unit2 arm
Unit2 armUnit2 arm
Unit2 arm
 
Unit 1c
Unit 1cUnit 1c
Unit 1c
 
Unit 1b
Unit 1bUnit 1b
Unit 1b
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
 
Introduction
IntroductionIntroduction
Introduction
 
unit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONSunit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONS
 
VLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTORVLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTOR
 

Recently uploaded

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
RaedMohamed3
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Po-Chuan Chen
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
MysoreMuleSoftMeetup
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
Vikramjit Singh
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
vaibhavrinwa19
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
JosvitaDsouza2
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
Tamralipta Mahavidyalaya
 

Recently uploaded (20)

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdfAdversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
Adversarial Attention Modeling for Multi-dimensional Emotion Regression.pdf
 
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
Mule 4.6 & Java 17 Upgrade | MuleSoft Mysore Meetup #46
 
Digital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and ResearchDigital Tools and AI for Teaching Learning and Research
Digital Tools and AI for Teaching Learning and Research
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 

Unit 3 part2

  • 2. Program design and analysis Program-level performance analysis. Optimizing for: Execution time. Energy/power. Program size. Program validation and testing.
  • 3. Program-level performance analysis  Need to understand performance in detail: Real-time behavior, not just typical. On complex platforms.  Program performance  CPU performance: Pipeline, cache are windows into program. We must analyze the entire program.
  • 4. Complexities of program performance Varies with input data: Different-length paths. Cache effects. Instruction-level performance variations: Pipeline interlocks. Fetch times.
  • 5. How to measure program performance Simulate execution of the CPU. Makes CPU state visible. Measure on real CPU using timer. Requires modifying the program to control the timer. Measure on real CPU using logic analyzer. Requires events visible on the pins.
  • 6. Program performance metrics Average-case execution time. Typically used in application programming. Worst-case execution time. A component in deadline satisfaction. Best-case execution time. Task-level interactions can cause best-case program behavior to result in worst-case system behavior.
  • 7. Elements of program performance Basic program execution time formula: execution time = program path + instruction timing Solving these problems independently helps simplify analysis. Easier to separate on simpler CPUs. Accurate performance analysis requires: Assembly/binary code. Execution platform.
  • 8. Data-dependent paths in an if statement if (a || b) { /* T1 */ if ( c ) /* T2 */ x = r*s+t; /* A1 */ else y=r+s; /* A2 */ z = r+s+u; /* A3 */ } else { if ( c ) /* T3 */ y = r-t; /* A4 */ } a b c path 0 0 0 T1=F, T3=F: no assignments 0 0 1 T1=F, T3=T: A4 0 1 0 T1=T, T2=F: A2, A3 0 1 1 T1=T, T2=T: A1, A3 1 0 0 T1=T, T2=F: A2, A3 1 0 1 T1=T, T2=T: A1, A3 1 1 0 T1=T, T2=F: A2, A3 1 1 1 T1=T, T2=T: A1, A3
  • 9. Paths in a loop for (i=0, f=0; i<N; i++) f = f + c[i] * x[i]; i=0 f=0 i=N f = f + c[i] * x[i] i = i + 1 N Y
  • 10. Instruction timing  Not all instructions take the same amount of time. Multi-cycle instructions. Fetches.  Execution times of instructions are not independent. Pipeline interlocks. Cache effects.  Execution times may vary with operand value. Floating-point operations.
  • 11. Mesaurement-driven performance analysis Program trace Simulator H/S Profiling-it counts the no. of times that procedures or basic blocks I the program are executed
  • 12. Physical measurement In-circuit emulator allows tracing. Affects execution timing. Logic analyzer can measure behavior at pins. Address bus can be analyzed to look for events. Code can be modified to make events visible. Particularly important for real-world input streams.
  • 13. CPU simulation Some simulators are less accurate. Cycle-accurate simulator provides accurate clock-cycle timing. Simulator models CPU internals. Simulator writer must know how CPU works.
  • 14. Performance optimization motivation Embedded systems must often meet deadlines. Faster may not be fast enough. Need to be able to analyze execution time. Worst-case, not typical. Need techniques for reliably improving execution time.
  • 15. Programs and performance analysis Best results come from analyzing optimized instructions, not high-level language code: non-obvious translations of HLL statements into instructions; code may move; cache effects are hard to predict.
  • 17. Loop optimizations Loops are good targets for optimization. Basic loop optimizations: code motion; induction-variable elimination; strength reduction (x*2 -> x<<1).
  • 18. Code motion BEFORE OPTIMIZATION for (i=0, f=0; i<N; i++) { x=y+z; a[i]=6*i; } AFTER OPTIMIZATION x=y+z; for (i=0, f=0; i<N; i++) { a[i]=6*I; }
  • 19. Induction variable elimination (nested loop) BEFORE OPTIMIZATION for (i=0; i<N; i++) { for (j=0; j<M; j++) { z[i] [j]= b[i] [j]; } } AFTER OPTIMIZATION for (i=0; i<N; i++) for (j=0; j<M; j++) { Zbinduct = i*M+j; *(Zptr+Zbinduct)=*(bptr+ Z binduct); }
  • 21. Cache analysis Loop nest: set of loops, one inside other. Perfect loop nest: no conditionals in nest. Because loops use large quantities of data, cache conflicts are common.
  • 22. Array conflicts in cache a[0,0] b[0,0] main memory cache 1024 4099 ... 1024 4099
  • 23. Array conflicts, cont’d. Array elements conflict because they are in the same line, even if not mapped to same location. Solutions: move one array; pad array.
  • 24. Performance optimization hints Use registers efficiently. Use page mode memory accesses. Analyze cache behavior: instruction conflicts can be handled by rewriting code, rescheudling; conflicting array data can be moved, padded.
  • 25. PROGRAM LEVEL ENERGY AND POWER ANALYSIS OPTIMIZATION
  • 26. Energy/power optimization Energy: ability to do work. Most important in battery-powered systems. Power: energy per unit time. Important even in wall-plug systems---power becomes heat.
  • 27. Measuring energy consumption Execute a small loop, measure current: while (TRUE) a(); I
  • 28. Cache behavior is important Energy consumption has a sweet spot as cache size changes: cache too small: program thrashes, burning energy on external memory accesses; cache too large: cache itself burns too much power.
  • 29. Optimizing for energy First-order optimization: high performance = low energy. Not many instructions trade speed for energy.
  • 30. Optimizing for energy, cont’d. Use registers efficiently. Identify and eliminate cache conflicts. Moderate loop unrolling eliminates some loop overhead instructions. Eliminate pipeline stalls. Inlining procedures may help: reduces linkage, but may increase cache thrashing.
  • 31. ANALYSIS AND OPTIMIZATION OF PROGRAM SIZE
  • 32. Optimizing for program size Goal: reduce hardware cost of memory; reduce power consumption of memory units. Two opportunities: data; instructions.
  • 33. Data size minimization Reuse constants, variables, data buffers in different parts of code. Requires careful verification of correctness. Encapsulating functions in subroutines when done carefully can reduce program size
  • 35. Program validation and testing Concentrate here on functional verification. Major testing strategies: Black box doesn’t look at the source code. Clear box (white box) does look at the source code.
  • 36. Clear-box testing Examine the source code to determine whether it works: Can you actually exercise a path? Do you get the value you expect along a path? Testing procedure: Controllability: rovide program with inputs. Execute. Observability: examine outputs.
  • 37. Execution paths and testing Paths are important in functional testing as well as performance analysis. In general, an exponential number of paths through the program. Show that some paths dominate others. Heuristically limit paths.
  • 38. Choosing the paths to test Possible criteria: Execute every statement at least once. Execute every branch direction at least once. Equivalent for structured programs. Not true for gotos. not covered
  • 39. Basis paths Approximate CDFG with undirected graph. Undirected graphs have basis paths: All paths are linear combinations of basis paths.
  • 40. Cyclomatic complexity Cyclomatic complexity is a bound on the size of basis sets: e = # edges n = # nodes p = number of graph components M = e – n + 2p.
  • 41. Branch testing Heuristic for testing branches. Exercise true and false branches of conditional. Exercise every simple condition at least once.
  • 42. Domain testing Heuristic test for linear inequalities. Test on each side + boundary of inequality.
  • 43. Def-use pairs Variable def-use: Def when value is assigned (defined). Exercise each def-use pair.
  • 44. Loop testing Loops need specialized tests to be tested efficiently. Heuristic testing strategy: Skip loop entirely. One loop iteration. Two loop iterations. # iterations much below max. n-1, n, n+1 iterations where n is max.
  • 45. Black-box testing Complements clear-box testing. May require a large number of tests. Tests software in different ways.
  • 46. Black-box test vectors Random tests. May weight distribution based on software specification. Regression tests. Tests of previous versions, bugs, etc. May be clear-box tests of previous versions.
  • 47. What atlast?  Must have the knowledge of compiler opt.  Well worse in Embedded C  To interpret the data structures whenever needed  Must know the DFG and CDFG  Wide knowledge on processor (target)  Techniques in prog optimization  Develop the code which occupies less memory space with much features  Smart usage of loops in the code