SlideShare a Scribd company logo
1 of 26
Code-scheduling
constraints
By
ARCHANA.M
II-MSC(CS)
DEPERTMENT OF CS &IT
NADAR SARASWATHI COLLEGE OF ARTS AND SCIENCE
This includes:
Introduction
Data Dependence
Finding Dependence Among
Memory Access
Tradeoff Between Register Usage
and Parallelism
Phase Ordering Between Register
Allocation and Code Scheduling
Control Dependence
Speculative Execution Support
A Basic Machine Model
Introduction:
Code scheduling is a form of program optimization that applies
to the machine code that is produced by the code generator.
Code scheduling is subject to three kinds of constraints:
Control-dependence constraints
Data-dependence constraints
Resource constraints
Cont.,
These scheduling constraints guarantee that the optimized program pro-
duces the same results as the original.
However, because code scheduling changes the order in which the
operations execute, the state of the memory at any one point may not
match any of the memory states in a sequential execution.
This situation is a problem if a program's execution is interrupted by, for
example, a thrown exception or a user-inserted breakpoint. Optimized
programs are therefore harder to debug.
Data Dependence:
It is easy to see that if we change the execution order of two operations
that do not touch any of the same variables, we cannot possibly affect their
results.
In fact, even if these two operations read the same variable, we can still
permute their execution. Only if an operation writes to a variable read or
written by another can changing their execution order alter their results.
Such pairs of operations are said to share a data dependence, and their
relative execution order must be preserved
Cont.,
There are three flavors of data dependence:
True dependence
Anti-dependence
Output dependence
Cont.,
True dependence: read after write. If a write is followed by a read of the
same location, the read depends on the value written; such a dependence is
known as a true dependence.
Anti-dependence: write after read. If a read is followed by a write to the
same location, we say that there is an antidependence from the read to the
write. The write does not depend on the read per se, but if the write
happens before the read, then the read operation will pick up the wrong
value
Cont.,
Output dependence: write after write. Two writes to the same location
share an output dependence. If the dependence is violated, the value of the
memory location written will have the wrong value after both operations
are performed.
Anti-dependence and output dependences are referred to as storage-related
dependences. These are not "true" dependences and can be eliminated by
using different locations to store different values
Finding Dependence Among Memory
Access:
To check if two memory accesses share a data dependence, we only need
to tell if they can refer to the same location; we do not need to know which
location is being accessed.
Data-dependence analysis is highly sensitive to the programming
language used in the program.
A correct discovery of data dependences requires a number of different
forms of analysis.
Array Data-Dependence Analysis:
Array data dependence is the problem of disambiguating between the
values of indexes in array-element accesses. For example, the loop
for (i = 0; i < n; i++)
A[2*i]= A[2*i+1];
copies odd elements in the array A to the even elements just preceding
them. Because all the read and written locations in the loop are distinct
from each other, there are no dependences between the accesses, and all
the iterations in the loop can execute in parallel.
Pointer - Alias Analysis:
We say that two pointers are aliased if they can refer to the same object.
Pointer-alias analysis is difficult because there are many potentially
aliased pointers in a program, and they can each point to an unbounded
number of dynamic objects over time.
To get any precision, pointer-alias analysis must be applied across all the
functions in a program.
Inter-procedural Analysis:
For languages that pass parameters by reference, interprocedural analysis
is needed to determine if the same variable is passed as two or more
different arguments.
Such aliases can create dependences between seemingly distinct
parameters.
Similarly, global variables can be used as parameters and thus create
dependences between parameter accesses and global variable accesses.
Tradeoff Between Register Usage and
Parallelism:
machine-independent intermediate representation of the source
program uses an unbounded numb er of pseudo registers to
represent variables that can be allocated to registers.
These variables include scalar variables in the source program that
cannot b e referred to by any other names, as well as temporary
variables that are generated by the compiler to hold the partial
results in expressions.
Unlike memory lo cations, registers are uniquely named. Thus
precise data-dependence constraints can be generated for register
accesses easily.
Cont.,
The unbounded numb er of pseudo registers used in the
intermediate representation must eventually be mapped to the
small number of physical registers available on the target machine.
Mapping several pseudo registers to the same physical register
has the unfortunate side effect of creating artificial storage
dependences that constrain instruction-level parallelism.
Conversely, executing instructions in parallel creates the need for
more storage to hold the values being computed simultaneously.
Phase Ordering Between Register
Allocation and Code Scheduling:
If registers are allocated before scheduling, the resulting code
tends to have many storage dependences that limit code
scheduling.
On the other hand, if code is scheduled before register
allocation, the schedule created may require so many registers
that register spilling may negate the advantages of instruction-
level parallelism
Control Dependence:
Scheduling operations within a basic block is relatively easy because all
the instructions are guaranteed to execute once control flow reaches the
beginning of the block.
Instructions in a basic block can be reordered arbitrarily, as long as all the
data dependences are satisfied. Unfortunately, basic blocks, especially in
nonnumeric programs, are typically very small; on average, there are only
about five instructions in a basic block.
In addition, operations in the same block are often highly related and thus
have little parallelism. Exploiting parallelism across basic blocks is
therefore crucial.
Cont.,
An optimized program must execute all the operations in the original pro-
gram. It can execute more instructions than the original, as long as the
extra instructions do not change what the program does.
Why would executing extra instructions speed up a program's execution?
If we know that an instruction is likely to be executed, and an idle
resource is available to perform the operation "for free," we can execute
the instruction speculatively.
The program runs faster when the speculation turns out to be correct.
Speculative Execution Support:
Memory loads are one type of instruction that can benefit greatly from
speculative execution. Memory loads are quite common, of course. They
have relatively long execution latencies, addresses used in the loads are
commonly available in advance, and the result can be stored in a new
temporary variable without destroying the value of any other variable.
Unfortunately, memory loads can raise exceptions if their addresses are
illegal, so speculatively accessing illegal addresses may cause a correct
program to halt unexpectedly.
Besides, mis-predicted memory loads can cause extra cache misses and
page faults, which are extremely costly.
Cont.,
Prefetching :
The prefetch instruction was invented to bring data from memory to
the cache before it is used.
A prefetch instruction indicates to the processor that the program is
likely to use a particular memory word in the near future.
If the location specified is invalid or if accessing it causes a page fault,
the processor can simply ignore the operation. Otherwise, the
processor will bring the data from memory to the cache if it is not
already there.
Cont.,
Poison Bits:
Another architectural feature called poison bits was invented to allow
speculative load of data from memory into the register file.
Each register on the machine is augmented with a poison bit. If illegal
memory is accessed or the accessed page is not in memory, the
processor does not raise the exception immediately but instead just sets
the poison bit of the destination register.
An exception is raised only if the contents of the register with a
marked poison bit are used.
Cont.,
Predicated Execution:
Because branches are expensive, and mispredicted branches are even
more so predicated instructions were invented to reduce the number of
branches in a program.
A predicated instruction is like a normal instruction but has an extra
predicate operand to guard its execution; the instruction is executed
only if the predicate is found to be true.
A Basic Machine Model:
Many machines can be represented using the following simple model. A
machine M =<R,T>, consists of:
1. A set of operation types T, such as loads, stores, arithmetic
operations, and so on
2. A vector R = [n, r ,... ] representing hardware resources, where r«
is the number of units available of the ith kind of resource. Examples of
typical resource types include: memory access units, ALU's, and
floating-point functional units
Cont.,
Each operation has a set of input operands, a set of output operands, and a
resource requirement.
Associated with each input operand is an input latency indicating when
the input value must be available.
Typical input operands have zero latency, meaning that the values are
needed immediately, at the clock when the operation is issued.
Similarly, associated with each output operand is an output latency, which
indicates when the result is available, relative to the start of the operation.
Cont.,
Typical machine operations occupy only one unit of resource at the time
an operation is issued.
Some operations may use more than one functional unit.
For example, a multiply-and-add operation may use a multiplier in the
first clock and an adder in the second.
Some operations, such as a divide, may need to occupy a resource for
several clocks.
Cont.,
Fully pipelined operations are those that can be issued every clock, even
though their results are not available until some number of clocks later.
We need not model the resources of every stage of a pipeline explicitly;
one single unit to represent the first stage will do.
Any operation occupying the first stage of a pipeline is guaranteed the
right to proceed to subsequent stages in subsequent clocks
Code scheduling constraints

More Related Content

What's hot

Fifth normal form
Fifth normal formFifth normal form
Fifth normal formAthi Sethu
 
Comparing Features of Real Time OS and Distributed.pptx
Comparing Features of Real Time OS and Distributed.pptxComparing Features of Real Time OS and Distributed.pptx
Comparing Features of Real Time OS and Distributed.pptx42MOHDASIL
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management SystemAAKANKSHA JAIN
 
Pipeline hazard
Pipeline hazardPipeline hazard
Pipeline hazardAJAL A J
 
Kernel. Operating System
Kernel. Operating SystemKernel. Operating System
Kernel. Operating Systempratikkadam78
 
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Dr.K. Thirunadana Sikamani
 
Dynamic storage allocation techniques in Compiler design
Dynamic storage allocation techniques in Compiler designDynamic storage allocation techniques in Compiler design
Dynamic storage allocation techniques in Compiler designkunjan shah
 
Distributed data processing
Distributed data processingDistributed data processing
Distributed data processingAyisha Kowsar
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecturePankaj Kumar Jain
 
Free space managment46
Free space managment46Free space managment46
Free space managment46myrajendra
 
Congetion Control.pptx
Congetion Control.pptxCongetion Control.pptx
Congetion Control.pptxNaveen Dubey
 
Operating Systems: Process Scheduling
Operating Systems: Process SchedulingOperating Systems: Process Scheduling
Operating Systems: Process SchedulingDamian T. Gordon
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system modelHarshad Umredkar
 

What's hot (20)

DBMS - RAID
DBMS - RAIDDBMS - RAID
DBMS - RAID
 
Compiler design
Compiler designCompiler design
Compiler design
 
Fifth normal form
Fifth normal formFifth normal form
Fifth normal form
 
Threads .ppt
Threads .pptThreads .ppt
Threads .ppt
 
Comparing Features of Real Time OS and Distributed.pptx
Comparing Features of Real Time OS and Distributed.pptxComparing Features of Real Time OS and Distributed.pptx
Comparing Features of Real Time OS and Distributed.pptx
 
Distributed Database Management System
Distributed Database Management SystemDistributed Database Management System
Distributed Database Management System
 
Pipeline hazard
Pipeline hazardPipeline hazard
Pipeline hazard
 
Kernel. Operating System
Kernel. Operating SystemKernel. Operating System
Kernel. Operating System
 
Join operation
Join operationJoin operation
Join operation
 
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
 
Dynamic storage allocation techniques in Compiler design
Dynamic storage allocation techniques in Compiler designDynamic storage allocation techniques in Compiler design
Dynamic storage allocation techniques in Compiler design
 
Distributed database
Distributed databaseDistributed database
Distributed database
 
Distributed data processing
Distributed data processingDistributed data processing
Distributed data processing
 
Distributed Operating System_1
Distributed Operating System_1Distributed Operating System_1
Distributed Operating System_1
 
program flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architectureprogram flow mechanisms, advanced computer architecture
program flow mechanisms, advanced computer architecture
 
11. dfs
11. dfs11. dfs
11. dfs
 
Free space managment46
Free space managment46Free space managment46
Free space managment46
 
Congetion Control.pptx
Congetion Control.pptxCongetion Control.pptx
Congetion Control.pptx
 
Operating Systems: Process Scheduling
Operating Systems: Process SchedulingOperating Systems: Process Scheduling
Operating Systems: Process Scheduling
 
distributed Computing system model
distributed Computing system modeldistributed Computing system model
distributed Computing system model
 

Similar to Code scheduling constraints

A novel algorithm to protect and manage memory locations
A novel algorithm to protect and manage memory locationsA novel algorithm to protect and manage memory locations
A novel algorithm to protect and manage memory locationsiosrjce
 
System Structure for Dependable Software Systems
System Structure for Dependable Software SystemsSystem Structure for Dependable Software Systems
System Structure for Dependable Software SystemsVincenzo De Florio
 
advanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismadvanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismPankaj Kumar Jain
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSijdpsjournal
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collectionsijdpsjournal
 
Software Reverse Engineering in a Security Context
Software Reverse Engineering in a Security ContextSoftware Reverse Engineering in a Security Context
Software Reverse Engineering in a Security ContextLokendra Rawat
 
Automatically inferring structure correlated variable set for concurrent atom...
Automatically inferring structure correlated variable set for concurrent atom...Automatically inferring structure correlated variable set for concurrent atom...
Automatically inferring structure correlated variable set for concurrent atom...ijseajournal
 
computer architecture and organization.pptx
computer architecture and organization.pptxcomputer architecture and organization.pptx
computer architecture and organization.pptxROHANSharma311906
 
Complier design
Complier design Complier design
Complier design shreeuva
 
Multithreading 101
Multithreading 101Multithreading 101
Multithreading 101Tim Penhey
 

Similar to Code scheduling constraints (20)

Compiler Design
Compiler DesignCompiler Design
Compiler Design
 
M017318288
M017318288M017318288
M017318288
 
A novel algorithm to protect and manage memory locations
A novel algorithm to protect and manage memory locationsA novel algorithm to protect and manage memory locations
A novel algorithm to protect and manage memory locations
 
C question
C questionC question
C question
 
System Structure for Dependable Software Systems
System Structure for Dependable Software SystemsSystem Structure for Dependable Software Systems
System Structure for Dependable Software Systems
 
advanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelismadvanced computer architesture-conditions of parallelism
advanced computer architesture-conditions of parallelism
 
Concurrency and parallel in .net
Concurrency and parallel in .netConcurrency and parallel in .net
Concurrency and parallel in .net
 
Parallel programming model
Parallel programming modelParallel programming model
Parallel programming model
 
LOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONSLOCK-FREE PARALLEL ACCESS COLLECTIONS
LOCK-FREE PARALLEL ACCESS COLLECTIONS
 
Lock free parallel access collections
Lock free parallel access collectionsLock free parallel access collections
Lock free parallel access collections
 
Software Reverse Engineering in a Security Context
Software Reverse Engineering in a Security ContextSoftware Reverse Engineering in a Security Context
Software Reverse Engineering in a Security Context
 
Automatically inferring structure correlated variable set for concurrent atom...
Automatically inferring structure correlated variable set for concurrent atom...Automatically inferring structure correlated variable set for concurrent atom...
Automatically inferring structure correlated variable set for concurrent atom...
 
Computer architecture
Computer architectureComputer architecture
Computer architecture
 
Oversimplified CA
Oversimplified CAOversimplified CA
Oversimplified CA
 
1844 1849
1844 18491844 1849
1844 1849
 
1844 1849
1844 18491844 1849
1844 1849
 
Reactive programming intro
Reactive programming introReactive programming intro
Reactive programming intro
 
computer architecture and organization.pptx
computer architecture and organization.pptxcomputer architecture and organization.pptx
computer architecture and organization.pptx
 
Complier design
Complier design Complier design
Complier design
 
Multithreading 101
Multithreading 101Multithreading 101
Multithreading 101
 

More from ArchanaMani2

Software evolution and Verification,validation
Software evolution and Verification,validationSoftware evolution and Verification,validation
Software evolution and Verification,validationArchanaMani2
 
Ajax enabled rich internet applications with xml and json
Ajax enabled rich internet applications with xml and jsonAjax enabled rich internet applications with xml and json
Ajax enabled rich internet applications with xml and jsonArchanaMani2
 
Excellence in visulization
Excellence in visulizationExcellence in visulization
Excellence in visulizationArchanaMani2
 
Transaction management
Transaction managementTransaction management
Transaction managementArchanaMani2
 
Topological Sort and BFS
Topological Sort and BFSTopological Sort and BFS
Topological Sort and BFSArchanaMani2
 
Inheritance and overriding
Inheritance  and overridingInheritance  and overriding
Inheritance and overridingArchanaMani2
 

More from ArchanaMani2 (10)

Software evolution and Verification,validation
Software evolution and Verification,validationSoftware evolution and Verification,validation
Software evolution and Verification,validation
 
Ajax enabled rich internet applications with xml and json
Ajax enabled rich internet applications with xml and jsonAjax enabled rich internet applications with xml and json
Ajax enabled rich internet applications with xml and json
 
Excellence in visulization
Excellence in visulizationExcellence in visulization
Excellence in visulization
 
Firewall
FirewallFirewall
Firewall
 
The linux system
The linux systemThe linux system
The linux system
 
Big data
Big dataBig data
Big data
 
Transaction management
Transaction managementTransaction management
Transaction management
 
Topological Sort and BFS
Topological Sort and BFSTopological Sort and BFS
Topological Sort and BFS
 
Genetic algorithm
Genetic algorithmGenetic algorithm
Genetic algorithm
 
Inheritance and overriding
Inheritance  and overridingInheritance  and overriding
Inheritance and overriding
 

Recently uploaded

Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxSayali Powar
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfMahmoud M. Sallam
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersSabitha Banu
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxabhijeetpadhi001
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitolTechU
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxRaymartEstabillo3
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceSamikshaHamane
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...jaredbarbolino94
 

Recently uploaded (20)

Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptxPOINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
POINT- BIOCHEMISTRY SEM 2 ENZYMES UNIT 5.pptx
 
Pharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdfPharmacognosy Flower 3. Compositae 2023.pdf
Pharmacognosy Flower 3. Compositae 2023.pdf
 
DATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginnersDATA STRUCTURE AND ALGORITHM for beginners
DATA STRUCTURE AND ALGORITHM for beginners
 
MICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptxMICROBIOLOGY biochemical test detailed.pptx
MICROBIOLOGY biochemical test detailed.pptx
 
ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)ESSENTIAL of (CS/IT/IS) class 06 (database)
ESSENTIAL of (CS/IT/IS) class 06 (database)
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Capitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptxCapitol Tech U Doctoral Presentation - April 2024.pptx
Capitol Tech U Doctoral Presentation - April 2024.pptx
 
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptxEPANDING THE CONTENT OF AN OUTLINE using notes.pptx
EPANDING THE CONTENT OF AN OUTLINE using notes.pptx
 
Roles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in PharmacovigilanceRoles & Responsibilities in Pharmacovigilance
Roles & Responsibilities in Pharmacovigilance
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...Historical philosophical, theoretical, and legal foundations of special and i...
Historical philosophical, theoretical, and legal foundations of special and i...
 

Code scheduling constraints

  • 1. Code-scheduling constraints By ARCHANA.M II-MSC(CS) DEPERTMENT OF CS &IT NADAR SARASWATHI COLLEGE OF ARTS AND SCIENCE
  • 2. This includes: Introduction Data Dependence Finding Dependence Among Memory Access Tradeoff Between Register Usage and Parallelism Phase Ordering Between Register Allocation and Code Scheduling Control Dependence Speculative Execution Support A Basic Machine Model
  • 3. Introduction: Code scheduling is a form of program optimization that applies to the machine code that is produced by the code generator. Code scheduling is subject to three kinds of constraints: Control-dependence constraints Data-dependence constraints Resource constraints
  • 4. Cont., These scheduling constraints guarantee that the optimized program pro- duces the same results as the original. However, because code scheduling changes the order in which the operations execute, the state of the memory at any one point may not match any of the memory states in a sequential execution. This situation is a problem if a program's execution is interrupted by, for example, a thrown exception or a user-inserted breakpoint. Optimized programs are therefore harder to debug.
  • 5. Data Dependence: It is easy to see that if we change the execution order of two operations that do not touch any of the same variables, we cannot possibly affect their results. In fact, even if these two operations read the same variable, we can still permute their execution. Only if an operation writes to a variable read or written by another can changing their execution order alter their results. Such pairs of operations are said to share a data dependence, and their relative execution order must be preserved
  • 6. Cont., There are three flavors of data dependence: True dependence Anti-dependence Output dependence
  • 7. Cont., True dependence: read after write. If a write is followed by a read of the same location, the read depends on the value written; such a dependence is known as a true dependence. Anti-dependence: write after read. If a read is followed by a write to the same location, we say that there is an antidependence from the read to the write. The write does not depend on the read per se, but if the write happens before the read, then the read operation will pick up the wrong value
  • 8. Cont., Output dependence: write after write. Two writes to the same location share an output dependence. If the dependence is violated, the value of the memory location written will have the wrong value after both operations are performed. Anti-dependence and output dependences are referred to as storage-related dependences. These are not "true" dependences and can be eliminated by using different locations to store different values
  • 9. Finding Dependence Among Memory Access: To check if two memory accesses share a data dependence, we only need to tell if they can refer to the same location; we do not need to know which location is being accessed. Data-dependence analysis is highly sensitive to the programming language used in the program. A correct discovery of data dependences requires a number of different forms of analysis.
  • 10. Array Data-Dependence Analysis: Array data dependence is the problem of disambiguating between the values of indexes in array-element accesses. For example, the loop for (i = 0; i < n; i++) A[2*i]= A[2*i+1]; copies odd elements in the array A to the even elements just preceding them. Because all the read and written locations in the loop are distinct from each other, there are no dependences between the accesses, and all the iterations in the loop can execute in parallel.
  • 11. Pointer - Alias Analysis: We say that two pointers are aliased if they can refer to the same object. Pointer-alias analysis is difficult because there are many potentially aliased pointers in a program, and they can each point to an unbounded number of dynamic objects over time. To get any precision, pointer-alias analysis must be applied across all the functions in a program.
  • 12. Inter-procedural Analysis: For languages that pass parameters by reference, interprocedural analysis is needed to determine if the same variable is passed as two or more different arguments. Such aliases can create dependences between seemingly distinct parameters. Similarly, global variables can be used as parameters and thus create dependences between parameter accesses and global variable accesses.
  • 13. Tradeoff Between Register Usage and Parallelism: machine-independent intermediate representation of the source program uses an unbounded numb er of pseudo registers to represent variables that can be allocated to registers. These variables include scalar variables in the source program that cannot b e referred to by any other names, as well as temporary variables that are generated by the compiler to hold the partial results in expressions. Unlike memory lo cations, registers are uniquely named. Thus precise data-dependence constraints can be generated for register accesses easily.
  • 14. Cont., The unbounded numb er of pseudo registers used in the intermediate representation must eventually be mapped to the small number of physical registers available on the target machine. Mapping several pseudo registers to the same physical register has the unfortunate side effect of creating artificial storage dependences that constrain instruction-level parallelism. Conversely, executing instructions in parallel creates the need for more storage to hold the values being computed simultaneously.
  • 15. Phase Ordering Between Register Allocation and Code Scheduling: If registers are allocated before scheduling, the resulting code tends to have many storage dependences that limit code scheduling. On the other hand, if code is scheduled before register allocation, the schedule created may require so many registers that register spilling may negate the advantages of instruction- level parallelism
  • 16. Control Dependence: Scheduling operations within a basic block is relatively easy because all the instructions are guaranteed to execute once control flow reaches the beginning of the block. Instructions in a basic block can be reordered arbitrarily, as long as all the data dependences are satisfied. Unfortunately, basic blocks, especially in nonnumeric programs, are typically very small; on average, there are only about five instructions in a basic block. In addition, operations in the same block are often highly related and thus have little parallelism. Exploiting parallelism across basic blocks is therefore crucial.
  • 17. Cont., An optimized program must execute all the operations in the original pro- gram. It can execute more instructions than the original, as long as the extra instructions do not change what the program does. Why would executing extra instructions speed up a program's execution? If we know that an instruction is likely to be executed, and an idle resource is available to perform the operation "for free," we can execute the instruction speculatively. The program runs faster when the speculation turns out to be correct.
  • 18. Speculative Execution Support: Memory loads are one type of instruction that can benefit greatly from speculative execution. Memory loads are quite common, of course. They have relatively long execution latencies, addresses used in the loads are commonly available in advance, and the result can be stored in a new temporary variable without destroying the value of any other variable. Unfortunately, memory loads can raise exceptions if their addresses are illegal, so speculatively accessing illegal addresses may cause a correct program to halt unexpectedly. Besides, mis-predicted memory loads can cause extra cache misses and page faults, which are extremely costly.
  • 19. Cont., Prefetching : The prefetch instruction was invented to bring data from memory to the cache before it is used. A prefetch instruction indicates to the processor that the program is likely to use a particular memory word in the near future. If the location specified is invalid or if accessing it causes a page fault, the processor can simply ignore the operation. Otherwise, the processor will bring the data from memory to the cache if it is not already there.
  • 20. Cont., Poison Bits: Another architectural feature called poison bits was invented to allow speculative load of data from memory into the register file. Each register on the machine is augmented with a poison bit. If illegal memory is accessed or the accessed page is not in memory, the processor does not raise the exception immediately but instead just sets the poison bit of the destination register. An exception is raised only if the contents of the register with a marked poison bit are used.
  • 21. Cont., Predicated Execution: Because branches are expensive, and mispredicted branches are even more so predicated instructions were invented to reduce the number of branches in a program. A predicated instruction is like a normal instruction but has an extra predicate operand to guard its execution; the instruction is executed only if the predicate is found to be true.
  • 22. A Basic Machine Model: Many machines can be represented using the following simple model. A machine M =<R,T>, consists of: 1. A set of operation types T, such as loads, stores, arithmetic operations, and so on 2. A vector R = [n, r ,... ] representing hardware resources, where r« is the number of units available of the ith kind of resource. Examples of typical resource types include: memory access units, ALU's, and floating-point functional units
  • 23. Cont., Each operation has a set of input operands, a set of output operands, and a resource requirement. Associated with each input operand is an input latency indicating when the input value must be available. Typical input operands have zero latency, meaning that the values are needed immediately, at the clock when the operation is issued. Similarly, associated with each output operand is an output latency, which indicates when the result is available, relative to the start of the operation.
  • 24. Cont., Typical machine operations occupy only one unit of resource at the time an operation is issued. Some operations may use more than one functional unit. For example, a multiply-and-add operation may use a multiplier in the first clock and an adder in the second. Some operations, such as a divide, may need to occupy a resource for several clocks.
  • 25. Cont., Fully pipelined operations are those that can be issued every clock, even though their results are not available until some number of clocks later. We need not model the resources of every stage of a pipeline explicitly; one single unit to represent the first stage will do. Any operation occupying the first stage of a pipeline is guaranteed the right to proceed to subsequent stages in subsequent clocks