SlideShare a Scribd company logo
1 of 47
Program design and
analysis
mr.C.KARTHIKEYAN
AP/ECE/RMKCET
Program design and
analysis
Program-level performance analysis.
Optimizing for:
Execution time.
Energy/power.
Program size.
Program validation and testing.
Program-level performance
analysis
 Need to understand
performance in detail:
Real-time behavior, not
just typical.
On complex platforms.
 Program performance 
CPU performance:
Pipeline, cache are
windows into program.
We must analyze the entire
program.
Complexities of program
performance
Varies with input data:
Different-length paths.
Cache effects.
Instruction-level performance variations:
Pipeline interlocks.
Fetch times.
How to measure program
performance
Simulate execution of the CPU.
Makes CPU state visible.
Measure on real CPU using timer.
Requires modifying the program to control
the timer.
Measure on real CPU using logic analyzer.
Requires events visible on the pins.
Program performance
metrics
Average-case execution time.
Typically used in application programming.
Worst-case execution time.
A component in deadline satisfaction.
Best-case execution time.
Task-level interactions can cause best-case
program behavior to result in worst-case
system behavior.
Elements of program
performance
Basic program execution time formula:
execution time = program path + instruction timing
Solving these problems independently helps
simplify analysis.
Easier to separate on simpler CPUs.
Accurate performance analysis requires:
Assembly/binary code.
Execution platform.
Data-dependent paths in
an if statement
if (a || b) { /* T1 */
if ( c ) /* T2 */
x = r*s+t; /* A1 */
else y=r+s; /* A2 */
z = r+s+u; /* A3 */
}
else {
if ( c ) /* T3 */
y = r-t; /* A4 */
}
a b c path
0 0 0 T1=F, T3=F: no assignments
0 0 1 T1=F, T3=T: A4
0 1 0 T1=T, T2=F: A2, A3
0 1 1 T1=T, T2=T: A1, A3
1 0 0 T1=T, T2=F: A2, A3
1 0 1 T1=T, T2=T: A1, A3
1 1 0 T1=T, T2=F: A2, A3
1 1 1 T1=T, T2=T: A1, A3
Paths in a loop
for (i=0, f=0; i<N; i++)
f = f + c[i] * x[i];
i=0
f=0
i=N
f = f + c[i] * x[i]
i = i + 1
N
Y
Instruction timing
 Not all instructions take the same amount of time.
Multi-cycle instructions.
Fetches.
 Execution times of instructions are not independent.
Pipeline interlocks.
Cache effects.
 Execution times may vary with operand value.
Floating-point operations.
Mesaurement-driven
performance analysis
Program trace
Simulator
H/S
Profiling-it counts the no. of times that
procedures or basic blocks I the program
are executed
Physical measurement
In-circuit emulator allows tracing.
Affects execution timing.
Logic analyzer can measure behavior at pins.
Address bus can be analyzed to look for events.
Code can be modified to make events visible.
Particularly important for real-world input
streams.
CPU simulation
Some simulators are less accurate.
Cycle-accurate simulator provides
accurate clock-cycle timing.
Simulator models CPU internals.
Simulator writer must know how CPU works.
Performance optimization
motivation
Embedded systems must often meet
deadlines.
Faster may not be fast enough.
Need to be able to analyze execution
time.
Worst-case, not typical.
Need techniques for reliably improving
execution time.
Programs and performance
analysis
Best results come from analyzing
optimized instructions, not high-level
language code:
non-obvious translations of HLL statements
into instructions;
code may move;
cache effects are hard to predict.
SOFTWARE PERFORMNACE OPTIMIZATION
Loop optimizations
Loops are good targets for optimization.
Basic loop optimizations:
code motion;
induction-variable elimination;
strength reduction (x*2 -> x<<1).
Code motion
BEFORE OPTIMIZATION
for (i=0, f=0; i<N; i++)
{
x=y+z;
a[i]=6*i;
}
AFTER OPTIMIZATION
x=y+z;
for (i=0, f=0; i<N; i++)
{
a[i]=6*I;
}
Induction variable
elimination (nested loop)
BEFORE OPTIMIZATION
for (i=0; i<N; i++)
{
for (j=0; j<M; j++)
{
z[i] [j]= b[i] [j];
}
}
AFTER OPTIMIZATION
for (i=0; i<N; i++)
for (j=0; j<M; j++)
{
Zbinduct = i*M+j;
*(Zptr+Zbinduct)=*(bptr+
Z binduct);
}
Strength reduction
X=2
Y=x*2
ls 1
2->0010
Ls 1-> 0100
Cache analysis
Loop nest: set of loops, one inside other.
Perfect loop nest: no conditionals in nest.
Because loops use large quantities of
data, cache conflicts are common.
Array conflicts in cache
a[0,0]
b[0,0]
main memory cache
1024 4099
...
1024
4099
Array conflicts, cont’d.
Array elements conflict because they are
in the same line, even if not mapped to
same location.
Solutions:
move one array;
pad array.
Performance optimization
hints
Use registers efficiently.
Use page mode memory accesses.
Analyze cache behavior:
instruction conflicts can be handled by
rewriting code, rescheudling;
conflicting array data can be moved, padded.
PROGRAM LEVEL ENERGY AND POWER
ANALYSIS OPTIMIZATION
Energy/power optimization
Energy: ability to do work.
Most important in battery-powered systems.
Power: energy per unit time.
Important even in wall-plug systems---power
becomes heat.
Measuring energy
consumption
Execute a small loop, measure current:
while (TRUE)
a();
I
Cache behavior is
important
Energy consumption has a sweet spot as
cache size changes:
cache too small: program thrashes, burning
energy on external memory accesses;
cache too large: cache itself burns too much
power.
Optimizing for energy
First-order optimization:
high performance = low energy.
Not many instructions trade speed for
energy.
Optimizing for energy,
cont’d.
Use registers efficiently.
Identify and eliminate cache conflicts.
Moderate loop unrolling eliminates some
loop overhead instructions.
Eliminate pipeline stalls.
Inlining procedures may help: reduces
linkage, but may increase cache
thrashing.
ANALYSIS AND OPTIMIZATION OF
PROGRAM SIZE
Optimizing for program
size
Goal:
reduce hardware cost of memory;
reduce power consumption of memory units.
Two opportunities:
data;
instructions.
Data size minimization
Reuse constants, variables, data buffers in
different parts of code.
Requires careful verification of correctness.
Encapsulating functions in subroutines when
done carefully can reduce program size
PROGRAM VALIDATION AND
TESTING
Program validation and
testing
Concentrate here on functional verification.
Major testing strategies:
Black box doesn’t look at the source code.
Clear box (white box) does look at the
source code.
Clear-box testing
Examine the source code to determine whether
it works:
Can you actually exercise a path?
Do you get the value you expect along a path?
Testing procedure:
Controllability: rovide program with inputs.
Execute.
Observability: examine outputs.
Execution paths and
testing
Paths are important in functional testing
as well as performance analysis.
In general, an exponential number of
paths through the program.
Show that some paths dominate others.
Heuristically limit paths.
Choosing the paths to test
Possible criteria:
Execute every
statement at least
once.
Execute every branch
direction at least once.
Equivalent for
structured programs.
Not true for gotos.
not covered
Basis paths
Approximate CDFG
with undirected
graph.
Undirected graphs
have basis paths:
All paths are linear
combinations of basis
paths.
Cyclomatic complexity
Cyclomatic complexity
is a bound on the size
of basis sets:
e = # edges
n = # nodes
p = number of graph
components
M = e – n + 2p.
Branch testing
Heuristic for testing branches.
Exercise true and false branches of
conditional.
Exercise every simple condition at least once.
Domain testing
Heuristic test for
linear inequalities.
Test on each side +
boundary of
inequality.
Def-use pairs
Variable def-use:
Def when value is
assigned (defined).
Exercise each def-use
pair.
Loop testing
Loops need specialized tests to be tested
efficiently.
Heuristic testing strategy:
Skip loop entirely.
One loop iteration.
Two loop iterations.
# iterations much below max.
n-1, n, n+1 iterations where n is max.
Black-box testing
Complements clear-box testing.
May require a large number of tests.
Tests software in different ways.
Black-box test vectors
Random tests.
May weight distribution based on software
specification.
Regression tests.
Tests of previous versions, bugs, etc.
May be clear-box tests of previous versions.
What atlast?
 Must have the knowledge of compiler opt.
 Well worse in Embedded C
 To interpret the data structures whenever needed
 Must know the DFG and CDFG
 Wide knowledge on processor (target)
 Techniques in prog optimization
 Develop the code which occupies less memory space
with much features
 Smart usage of loops in the code

More Related Content

What's hot

A framework for nonlinear model predictive control
A framework for nonlinear model predictive controlA framework for nonlinear model predictive control
A framework for nonlinear model predictive controlModelon
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) A B Shinde
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compilerVignesh Tamil
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computerrud_d_rcks
 
[Capella Day 2019] Model execution and system simulation in Capella
[Capella Day 2019] Model execution and system simulation in Capella[Capella Day 2019] Model execution and system simulation in Capella
[Capella Day 2019] Model execution and system simulation in CapellaObeo
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.MITS Gwalior
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer ArchitecturePankaj Kumar Jain
 
Model Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEWModel Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEWyurongwang1
 
Lecture 3
Lecture 3Lecture 3
Lecture 3Mr SMAK
 
Xilinx timing closure
Xilinx timing closureXilinx timing closure
Xilinx timing closureSampath Reddy
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architectureSourav Routh
 
Multi phase mixture media
Multi phase mixture mediaMulti phase mixture media
Multi phase mixture mediaModelon
 
2 parallel processing presentation ph d 1st semester
2 parallel processing presentation ph d 1st semester2 parallel processing presentation ph d 1st semester
2 parallel processing presentation ph d 1st semesterRafi Ullah
 
Vector Supercomputers and Scientific Array Processors
Vector Supercomputers and Scientific Array ProcessorsVector Supercomputers and Scientific Array Processors
Vector Supercomputers and Scientific Array ProcessorsHsuvas Borkakoty
 
Performance measures
Performance measuresPerformance measures
Performance measuresDivya Tiwari
 
Bulk-Synchronous-Parallel - BSP
Bulk-Synchronous-Parallel - BSPBulk-Synchronous-Parallel - BSP
Bulk-Synchronous-Parallel - BSPMd Syed Ahamad
 

What's hot (20)

A framework for nonlinear model predictive control
A framework for nonlinear model predictive controlA framework for nonlinear model predictive control
A framework for nonlinear model predictive control
 
Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism) Pipelining and ILP (Instruction Level Parallelism)
Pipelining and ILP (Instruction Level Parallelism)
 
parallel language and compiler
parallel language and compilerparallel language and compiler
parallel language and compiler
 
Flow control in computer
Flow control in computerFlow control in computer
Flow control in computer
 
[Capella Day 2019] Model execution and system simulation in Capella
[Capella Day 2019] Model execution and system simulation in Capella[Capella Day 2019] Model execution and system simulation in Capella
[Capella Day 2019] Model execution and system simulation in Capella
 
Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.Parallel programming model, language and compiler in ACA.
Parallel programming model, language and compiler in ACA.
 
program partitioning and scheduling IN Advanced Computer Architecture
program partitioning and scheduling  IN Advanced Computer Architectureprogram partitioning and scheduling  IN Advanced Computer Architecture
program partitioning and scheduling IN Advanced Computer Architecture
 
Model Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEWModel Predictive Control Implementation with LabVIEW
Model Predictive Control Implementation with LabVIEW
 
Lecture 3
Lecture 3Lecture 3
Lecture 3
 
Xilinx timing closure
Xilinx timing closureXilinx timing closure
Xilinx timing closure
 
Data flow architecture
Data flow architectureData flow architecture
Data flow architecture
 
Multi phase mixture media
Multi phase mixture mediaMulti phase mixture media
Multi phase mixture media
 
Start MPC
Start MPC Start MPC
Start MPC
 
2 parallel processing presentation ph d 1st semester
2 parallel processing presentation ph d 1st semester2 parallel processing presentation ph d 1st semester
2 parallel processing presentation ph d 1st semester
 
Vector Supercomputers and Scientific Array Processors
Vector Supercomputers and Scientific Array ProcessorsVector Supercomputers and Scientific Array Processors
Vector Supercomputers and Scientific Array Processors
 
Lect08
Lect08Lect08
Lect08
 
Calfem34
Calfem34Calfem34
Calfem34
 
Performance measures
Performance measuresPerformance measures
Performance measures
 
Aca2 10 11
Aca2 10 11Aca2 10 11
Aca2 10 11
 
Bulk-Synchronous-Parallel - BSP
Bulk-Synchronous-Parallel - BSPBulk-Synchronous-Parallel - BSP
Bulk-Synchronous-Parallel - BSP
 

Similar to Unit 3 part2

pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture designssuser87fa0c1
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseMd. Shamsur Rahim
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdfNayanChandak1
 
SE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptxSE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptxPraneethBhai1
 
Cs 568 Spring 10 Lecture 5 Estimation
Cs 568 Spring 10  Lecture 5 EstimationCs 568 Spring 10  Lecture 5 Estimation
Cs 568 Spring 10 Lecture 5 EstimationLawrence Bernstein
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)praveena p
 
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptxICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptxjohnsmith96441
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Praveen Penumathsa
 
Qat09 presentations dxw07u
Qat09 presentations dxw07uQat09 presentations dxw07u
Qat09 presentations dxw07uShubham Sharma
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxSyed Zaid Irshad
 
Computer architecture short note (version 8)
Computer architecture short note (version 8)Computer architecture short note (version 8)
Computer architecture short note (version 8)Nimmi Weeraddana
 
SecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.pptSecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.pptRubenGabrielHernande
 
A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy Ehsan Sharifi
 

Similar to Unit 3 part2 (20)

pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 
Load Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed DatabaseLoad Balancing in Parallel and Distributed Database
Load Balancing in Parallel and Distributed Database
 
Algorithm Analysis.pdf
Algorithm Analysis.pdfAlgorithm Analysis.pdf
Algorithm Analysis.pdf
 
Training - What is Performance ?
Training  - What is Performance ?Training  - What is Performance ?
Training - What is Performance ?
 
Vedic Calculator
Vedic CalculatorVedic Calculator
Vedic Calculator
 
Code Optimization
Code OptimizationCode Optimization
Code Optimization
 
SE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptxSE UNIT 5 part 2 (1).pptx
SE UNIT 5 part 2 (1).pptx
 
Cs 568 Spring 10 Lecture 5 Estimation
Cs 568 Spring 10  Lecture 5 EstimationCs 568 Spring 10  Lecture 5 Estimation
Cs 568 Spring 10 Lecture 5 Estimation
 
Unit 1 python (2021 r)
Unit 1 python (2021 r)Unit 1 python (2021 r)
Unit 1 python (2021 r)
 
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptxICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
ICS 2410.Parallel.Sytsems.Lecture.Week 3.week5.pptx
 
Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram Generating test cases using UML Communication Diagram
Generating test cases using UML Communication Diagram
 
Qat09 presentations dxw07u
Qat09 presentations dxw07uQat09 presentations dxw07u
Qat09 presentations dxw07u
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
1.prallelism
1.prallelism1.prallelism
1.prallelism
 
Design and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptxDesign and Analysis of Algorithms.pptx
Design and Analysis of Algorithms.pptx
 
Lecture1
Lecture1Lecture1
Lecture1
 
Computer architecture short note (version 8)
Computer architecture short note (version 8)Computer architecture short note (version 8)
Computer architecture short note (version 8)
 
Testing
TestingTesting
Testing
 
SecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.pptSecondPresentationDesigning_Parallel_Programs.ppt
SecondPresentationDesigning_Parallel_Programs.ppt
 
A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy A Study on Task Scheduling in Could Data Centers for Energy Efficacy
A Study on Task Scheduling in Could Data Centers for Energy Efficacy
 

More from Karthik Vivek

Peak detector, instrumentation amp
Peak detector, instrumentation ampPeak detector, instrumentation amp
Peak detector, instrumentation ampKarthik Vivek
 
U3 op amp applications
U3 op amp applicationsU3 op amp applications
U3 op amp applicationsKarthik Vivek
 
Fabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe tsFabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe tsKarthik Vivek
 
Embedded programming u3 part 1
Embedded programming u3 part 1Embedded programming u3 part 1
Embedded programming u3 part 1Karthik Vivek
 
ARM stacks, subroutines, Cortex M3, LPC 214X
ARM  stacks, subroutines, Cortex M3, LPC 214XARM  stacks, subroutines, Cortex M3, LPC 214X
ARM stacks, subroutines, Cortex M3, LPC 214XKarthik Vivek
 
ARM Versions, architecture
ARM Versions, architectureARM Versions, architecture
ARM Versions, architectureKarthik Vivek
 
unit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONSunit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONSKarthik Vivek
 
VLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTORVLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTORKarthik Vivek
 
Pipeline data path u3
Pipeline data path u3Pipeline data path u3
Pipeline data path u3Karthik Vivek
 

More from Karthik Vivek (20)

Peak detector, instrumentation amp
Peak detector, instrumentation ampPeak detector, instrumentation amp
Peak detector, instrumentation amp
 
U3 op amp applications
U3 op amp applicationsU3 op amp applications
U3 op amp applications
 
Unit 1 ic fab
Unit 1 ic fabUnit 1 ic fab
Unit 1 ic fab
 
Fabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe tsFabrication of diodes, resistors, capacitors, fe ts
Fabrication of diodes, resistors, capacitors, fe ts
 
Embedded programming u3 part 1
Embedded programming u3 part 1Embedded programming u3 part 1
Embedded programming u3 part 1
 
ARM stacks, subroutines, Cortex M3, LPC 214X
ARM  stacks, subroutines, Cortex M3, LPC 214XARM  stacks, subroutines, Cortex M3, LPC 214X
ARM stacks, subroutines, Cortex M3, LPC 214X
 
ARM inst set part 2
ARM inst set part 2ARM inst set part 2
ARM inst set part 2
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
 
ARM instruction set
ARM instruction  setARM instruction  set
ARM instruction set
 
ARM Versions, architecture
ARM Versions, architectureARM Versions, architecture
ARM Versions, architecture
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
 
Unit2 arm
Unit2 armUnit2 arm
Unit2 arm
 
Unit 1c
Unit 1cUnit 1c
Unit 1c
 
Unit 1b
Unit 1bUnit 1b
Unit 1b
 
Unit 1a train
Unit 1a trainUnit 1a train
Unit 1a train
 
Introduction
IntroductionIntroduction
Introduction
 
unit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONSunit 2- OP AMP APPLICATIONS
unit 2- OP AMP APPLICATIONS
 
VLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTORVLSI DESIGN- MOS TRANSISTOR
VLSI DESIGN- MOS TRANSISTOR
 
Power Dissipation
Power DissipationPower Dissipation
Power Dissipation
 
Pipeline data path u3
Pipeline data path u3Pipeline data path u3
Pipeline data path u3
 

Recently uploaded

Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104misteraugie
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxShobhayan Kirtania
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 

Recently uploaded (20)

Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104Nutritional Needs Presentation - HLTH 104
Nutritional Needs Presentation - HLTH 104
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
The byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptxThe byproduct of sericulture in different industries.pptx
The byproduct of sericulture in different industries.pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 

Unit 3 part2

  • 2. Program design and analysis Program-level performance analysis. Optimizing for: Execution time. Energy/power. Program size. Program validation and testing.
  • 3. Program-level performance analysis  Need to understand performance in detail: Real-time behavior, not just typical. On complex platforms.  Program performance  CPU performance: Pipeline, cache are windows into program. We must analyze the entire program.
  • 4. Complexities of program performance Varies with input data: Different-length paths. Cache effects. Instruction-level performance variations: Pipeline interlocks. Fetch times.
  • 5. How to measure program performance Simulate execution of the CPU. Makes CPU state visible. Measure on real CPU using timer. Requires modifying the program to control the timer. Measure on real CPU using logic analyzer. Requires events visible on the pins.
  • 6. Program performance metrics Average-case execution time. Typically used in application programming. Worst-case execution time. A component in deadline satisfaction. Best-case execution time. Task-level interactions can cause best-case program behavior to result in worst-case system behavior.
  • 7. Elements of program performance Basic program execution time formula: execution time = program path + instruction timing Solving these problems independently helps simplify analysis. Easier to separate on simpler CPUs. Accurate performance analysis requires: Assembly/binary code. Execution platform.
  • 8. Data-dependent paths in an if statement if (a || b) { /* T1 */ if ( c ) /* T2 */ x = r*s+t; /* A1 */ else y=r+s; /* A2 */ z = r+s+u; /* A3 */ } else { if ( c ) /* T3 */ y = r-t; /* A4 */ } a b c path 0 0 0 T1=F, T3=F: no assignments 0 0 1 T1=F, T3=T: A4 0 1 0 T1=T, T2=F: A2, A3 0 1 1 T1=T, T2=T: A1, A3 1 0 0 T1=T, T2=F: A2, A3 1 0 1 T1=T, T2=T: A1, A3 1 1 0 T1=T, T2=F: A2, A3 1 1 1 T1=T, T2=T: A1, A3
  • 9. Paths in a loop for (i=0, f=0; i<N; i++) f = f + c[i] * x[i]; i=0 f=0 i=N f = f + c[i] * x[i] i = i + 1 N Y
  • 10. Instruction timing  Not all instructions take the same amount of time. Multi-cycle instructions. Fetches.  Execution times of instructions are not independent. Pipeline interlocks. Cache effects.  Execution times may vary with operand value. Floating-point operations.
  • 11. Mesaurement-driven performance analysis Program trace Simulator H/S Profiling-it counts the no. of times that procedures or basic blocks I the program are executed
  • 12. Physical measurement In-circuit emulator allows tracing. Affects execution timing. Logic analyzer can measure behavior at pins. Address bus can be analyzed to look for events. Code can be modified to make events visible. Particularly important for real-world input streams.
  • 13. CPU simulation Some simulators are less accurate. Cycle-accurate simulator provides accurate clock-cycle timing. Simulator models CPU internals. Simulator writer must know how CPU works.
  • 14. Performance optimization motivation Embedded systems must often meet deadlines. Faster may not be fast enough. Need to be able to analyze execution time. Worst-case, not typical. Need techniques for reliably improving execution time.
  • 15. Programs and performance analysis Best results come from analyzing optimized instructions, not high-level language code: non-obvious translations of HLL statements into instructions; code may move; cache effects are hard to predict.
  • 17. Loop optimizations Loops are good targets for optimization. Basic loop optimizations: code motion; induction-variable elimination; strength reduction (x*2 -> x<<1).
  • 18. Code motion BEFORE OPTIMIZATION for (i=0, f=0; i<N; i++) { x=y+z; a[i]=6*i; } AFTER OPTIMIZATION x=y+z; for (i=0, f=0; i<N; i++) { a[i]=6*I; }
  • 19. Induction variable elimination (nested loop) BEFORE OPTIMIZATION for (i=0; i<N; i++) { for (j=0; j<M; j++) { z[i] [j]= b[i] [j]; } } AFTER OPTIMIZATION for (i=0; i<N; i++) for (j=0; j<M; j++) { Zbinduct = i*M+j; *(Zptr+Zbinduct)=*(bptr+ Z binduct); }
  • 21. Cache analysis Loop nest: set of loops, one inside other. Perfect loop nest: no conditionals in nest. Because loops use large quantities of data, cache conflicts are common.
  • 22. Array conflicts in cache a[0,0] b[0,0] main memory cache 1024 4099 ... 1024 4099
  • 23. Array conflicts, cont’d. Array elements conflict because they are in the same line, even if not mapped to same location. Solutions: move one array; pad array.
  • 24. Performance optimization hints Use registers efficiently. Use page mode memory accesses. Analyze cache behavior: instruction conflicts can be handled by rewriting code, rescheudling; conflicting array data can be moved, padded.
  • 25. PROGRAM LEVEL ENERGY AND POWER ANALYSIS OPTIMIZATION
  • 26. Energy/power optimization Energy: ability to do work. Most important in battery-powered systems. Power: energy per unit time. Important even in wall-plug systems---power becomes heat.
  • 27. Measuring energy consumption Execute a small loop, measure current: while (TRUE) a(); I
  • 28. Cache behavior is important Energy consumption has a sweet spot as cache size changes: cache too small: program thrashes, burning energy on external memory accesses; cache too large: cache itself burns too much power.
  • 29. Optimizing for energy First-order optimization: high performance = low energy. Not many instructions trade speed for energy.
  • 30. Optimizing for energy, cont’d. Use registers efficiently. Identify and eliminate cache conflicts. Moderate loop unrolling eliminates some loop overhead instructions. Eliminate pipeline stalls. Inlining procedures may help: reduces linkage, but may increase cache thrashing.
  • 31. ANALYSIS AND OPTIMIZATION OF PROGRAM SIZE
  • 32. Optimizing for program size Goal: reduce hardware cost of memory; reduce power consumption of memory units. Two opportunities: data; instructions.
  • 33. Data size minimization Reuse constants, variables, data buffers in different parts of code. Requires careful verification of correctness. Encapsulating functions in subroutines when done carefully can reduce program size
  • 35. Program validation and testing Concentrate here on functional verification. Major testing strategies: Black box doesn’t look at the source code. Clear box (white box) does look at the source code.
  • 36. Clear-box testing Examine the source code to determine whether it works: Can you actually exercise a path? Do you get the value you expect along a path? Testing procedure: Controllability: rovide program with inputs. Execute. Observability: examine outputs.
  • 37. Execution paths and testing Paths are important in functional testing as well as performance analysis. In general, an exponential number of paths through the program. Show that some paths dominate others. Heuristically limit paths.
  • 38. Choosing the paths to test Possible criteria: Execute every statement at least once. Execute every branch direction at least once. Equivalent for structured programs. Not true for gotos. not covered
  • 39. Basis paths Approximate CDFG with undirected graph. Undirected graphs have basis paths: All paths are linear combinations of basis paths.
  • 40. Cyclomatic complexity Cyclomatic complexity is a bound on the size of basis sets: e = # edges n = # nodes p = number of graph components M = e – n + 2p.
  • 41. Branch testing Heuristic for testing branches. Exercise true and false branches of conditional. Exercise every simple condition at least once.
  • 42. Domain testing Heuristic test for linear inequalities. Test on each side + boundary of inequality.
  • 43. Def-use pairs Variable def-use: Def when value is assigned (defined). Exercise each def-use pair.
  • 44. Loop testing Loops need specialized tests to be tested efficiently. Heuristic testing strategy: Skip loop entirely. One loop iteration. Two loop iterations. # iterations much below max. n-1, n, n+1 iterations where n is max.
  • 45. Black-box testing Complements clear-box testing. May require a large number of tests. Tests software in different ways.
  • 46. Black-box test vectors Random tests. May weight distribution based on software specification. Regression tests. Tests of previous versions, bugs, etc. May be clear-box tests of previous versions.
  • 47. What atlast?  Must have the knowledge of compiler opt.  Well worse in Embedded C  To interpret the data structures whenever needed  Must know the DFG and CDFG  Wide knowledge on processor (target)  Techniques in prog optimization  Develop the code which occupies less memory space with much features  Smart usage of loops in the code