Dr.Y.NARASIMHA MURTHY Ph.D
yayavaram@yahoo.com
VLSI –PHYSICAL DESIGN
Vlsi physical design-notes

These are class notes meant for M.Tech VLSI students of SKUCET and others.

Published in: Technology, Business
Transcript of "Vlsi physical design-notes"

  1. VLSI – PHYSICAL DESIGN
     INTRODUCTION:
     The geometric description obtained by transforming a circuit description is known as a layout. A layout consists of a set of planar geometric shapes in several layers. The process of converting the specifications of an electrical circuit into a layout is called physical design. Because of the large number of components and the fine details required by the fabrication process, physical design is not practical without the help of computers. As a result, almost all phases of physical design make extensive use of computer-aided design (CAD) tools, and many phases are partially or fully automated. This automation has increased the level of integration, reduced the turnaround time, and enhanced chip performance. Various CAD tools are available in the market, each with its own strengths and weaknesses. Electronic Design Automation (EDA) companies such as Cadence, Synopsys, Magma, and Mentor Graphics provide these tools.
     VLSI physical design automation mainly deals with the study of algorithms related to the physical design process. The objective is to study optimal arrangements of devices on a plane (or in three-dimensional space) and the interconnection schemes between these devices that achieve the desired functionality. Because space on a wafer is very expensive, algorithms must use it efficiently to decrease cost and improve yield. In addition, the arrangement of devices (placement) plays a key role in determining the performance of a chip. Algorithms for physical design must also ensure that all the rules required by the fabrication process are followed and that the layout is within its tolerance limits. Finally, algorithms must be efficient and able to handle very large designs. Efficient algorithms not only lead to fast turnaround time but also permit designers to improve layouts iteratively.
     VLSI DESIGN CYCLE:
     Producing a packaged VLSI chip involves a sequence of steps popularly known as the VLSI design cycle. This design cycle is normally represented by a flow chart shown below. The various steps involved in the design cycle are elaborated below.
  2. (i) System specification: The specifications of the system to be designed are stated precisely in this step. It considers the performance, functionality, and physical dimensions of the design. The choice of fabrication technology and design techniques is also considered. The end results are specifications for the size, speed, power, and functionality of the VLSI system to be designed.
     (ii) Functional design: In this step, the behavioral aspects of the system are considered. The outcome is usually a timing diagram or other relationships between sub-units. This information is used to improve the overall design process and to reduce the complexity of the subsequent phases.
     (iii) Logic design: In this step, the functional design is converted into a logic design using Boolean expressions. These expressions are minimized to achieve the smallest logic design
  3. which conforms to the functional design. The logic design of the system is simulated and tested to verify its correctness.
     (iv) Circuit design: This step converts the Boolean expressions into a circuit representation, taking into consideration the speed and power requirements of the original design. The electrical behavior of the various components is also considered in this phase. The circuit design is usually expressed as a detailed circuit diagram.
     (v) Physical design: In this step, the circuit representation of each component is converted into a geometric representation. This representation is a set of geometric patterns that perform the intended logic function of the corresponding component. Connections between different components are also expressed as geometric patterns. This geometric representation of a circuit is called a layout. The exact details of the layout also depend on design rules, which are guidelines based on the limitations of the fabrication process and the electrical properties of the fabrication materials. Physical design is a very complex process and is therefore usually broken down into various sub-steps.
     (vi) Design verification: In this step, the layout is verified to ensure that it meets the system specifications and the fabrication requirements. Design verification consists of design rule checking (DRC) and circuit extraction. DRC verifies that all geometric patterns meet the design rules imposed by the fabrication process. After the layout has been checked for design rule violations and the violations removed, its functionality is verified by circuit extraction. This is a reverse engineering process that generates the circuit representation from the layout. This reverse-engineered circuit representation can then be compared with the original circuit representation to verify the correctness of the layout.
     (vii) Fabrication: This step follows design verification. The fabrication process consists of several steps, such as preparation of the wafer and deposition and diffusion of various materials on the wafer according to the layout description. A typical wafer is 10 cm in diameter and can be used to produce between 12 and 30 chips. Before the chip is mass produced, a prototype is made and tested.
     (viii) Packaging, testing, and debugging: In this step, the wafer is fabricated and diced in a fabrication facility. Each chip is then packaged and tested to ensure that it meets all the design specifications and functions properly. Chips used in printed circuit boards (PCBs) are
  4. packaged in a dual in-line package (DIP) or pin grid array (PGA). Chips that are to be used in a multichip module (MCM) are not packaged, because MCMs use bare or naked chips.
     PHYSICAL DESIGN CYCLE:
     The physical design cycle converts a circuit diagram into a layout. This complex task is completed in several steps, such as partitioning, floorplanning, placement, routing, and layout compaction. The details of these steps are given below.
     (a) Partitioning: Chip layout is a complex task and is therefore divided into several smaller tasks. A chip may contain several million transistors. The layout of the entire circuit cannot be handled at once because of the limits of available memory and computation power. Therefore, the circuit is normally partitioned by grouping the components into blocks. The actual partitioning process considers many factors, such as the size of the blocks, the number of blocks, and the number of interconnections between the blocks. The output of partitioning is a set of blocks along with the interconnections required between them. The set of interconnections required is referred to as a netlist. In large circuits the partitioning process is hierarchical, and at the topmost level a chip may have between 5 and 25 blocks. Each block is then partitioned recursively into smaller blocks.
     A disadvantage of the partitioning process is that it may degrade the performance of the final design. During partitioning, critical components should be assigned to the same partition. If such an assignment is not possible, appropriate timing constraints must be generated to keep the critical components close together. Usually, several components forming a critical path determine the chip performance. If each component is assigned to a different partition, the critical path may be too long. Minimizing the length of critical paths improves system performance.
     After a chip has been partitioned, each of the sub-circuits must be placed on a fixed plane and the nets between all the partitions must be interconnected. The sub-circuits are placed by placement algorithms, and the nets are routed by routing algorithms.
     (b) Placement: Placement is the process of arranging a set of modules on the layout surface. Each module has a fixed shape and fixed terminal locations. A poor placement uses a larger area and hence results in performance degradation. The placement process determines the exact positions of the blocks on the chip, so as to find a minimum-area arrangement of the blocks that allows completion of interconnections between
  5. the blocks. Placement is typically done in two phases. In the first phase, an initial placement is created. In the second phase, the initial placement is evaluated and iterative improvements are made until the layout has minimum area and conforms to the design specifications. Some space between the blocks is intentionally left empty to allow interconnections between blocks. A placement may turn out to be unroutable, i.e., routing may not be possible in the space provided, in which case another iteration of placement is necessary. To limit the number of iterations of the placement algorithm, an estimate of the required routing space is used during the placement phase. Good routing and circuit performance depend heavily on a good placement algorithm, because once the position of each block is fixed, very little can be done to improve the routing and the overall circuit performance.
     There are various types of placement.
     System-level placement: All the PCBs are placed together such that the area occupied is minimum and the heat dissipation is within limits.
     Board-level placement: All the chips have to be placed on a PCB. The area is fixed and all modules are rectangular. The objective is to minimize the number of routing layers and meet the system performance requirements.
     Chip-level placement: Floorplanning or placement is normally carried out along with pin assignment. The number of routing layers is limited (2 to 4). Bad placements may be unroutable, which can be detected only later, during routing, and causes costly delays in the design cycle. The objective is to minimize area.
     Floorplanning: Floorplan design is an important step in the physical design of VLSI circuits. It plans the positions of a set of circuit modules on a chip in order to optimize the circuit performance. In floorplanning, information about a set of modules, including their areas and interconnections, is considered, and the goal is to plan their positions on a chip so as to minimize the total chip area and interconnect cost. In the floorplanning phase, the macro cells are positioned on the layout surface in such a way that no blocks overlap and enough space is left to complete the interconnections. The input to floorplanning is a set of modules, a list of terminals (pins for interconnections) for each module, and a netlist, which describes the terminals that have to be connected.
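The floorplanning goal described above (no overlapping blocks, small chip area, low interconnect cost) can be illustrated with a small evaluation routine. This is only a sketch under stated assumptions: the `Module` record, the weights `alpha` and `beta`, the use of module centres in place of real terminal positions, and the half-perimeter wirelength estimate are all choices made here for illustration and are not taken from the notes.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Module:
    name: str
    x: float  # lower-left corner
    y: float
    w: float  # width
    h: float  # height


def overlaps(a, b):
    """True if the interiors of two rectangles intersect (shared edges are allowed)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def is_legal(modules):
    """A floorplan is legal here if no two modules overlap."""
    return not any(overlaps(a, b) for a, b in combinations(modules, 2))


def chip_area(modules):
    """Area of the bounding box enclosing all modules."""
    width = max(m.x + m.w for m in modules) - min(m.x for m in modules)
    height = max(m.y + m.h for m in modules) - min(m.y for m in modules)
    return width * height


def wirelength(modules, nets):
    """Half-perimeter estimate per net; module centres stand in for pins (assumption)."""
    centre = {m.name: (m.x + m.w / 2, m.y + m.h / 2) for m in modules}
    total = 0.0
    for net in nets:
        xs = [centre[name][0] for name in net]
        ys = [centre[name][1] for name in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def floorplan_cost(modules, nets, alpha=1.0, beta=1.0):
    """Weighted sum of area and wirelength; illegal floorplans get infinite cost."""
    if not is_legal(modules):
        return float("inf")
    return alpha * chip_area(modules) + beta * wirelength(modules, nets)


plan = [Module("A", 0, 0, 4, 2), Module("B", 4, 0, 2, 2), Module("C", 0, 2, 6, 1)]
nets = [("A", "B"), ("A", "C"), ("B", "C")]
print(chip_area(plan), wirelength(plan, nets), floorplan_cost(plan, nets))
# prints: 18 9.0 27.0
```

A floorplanner would call a routine like `floorplan_cost` many times while it searches over module positions; the search strategy itself is not shown here.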
  6. Different approaches have been taken to the floorplanning problem. Wimer et al. describe a branch and bound approach for the floorplan sizing problem, i.e., finding an optimal combination of all possible layout alternatives for all modules after placement. While their algorithm is able to find the best solution for this problem, it is very time consuming, especially for real problem instances. Cohoon et al. implemented a genetic algorithm for the whole floorplanning problem. Their algorithm makes use of estimates for the required routing space to ensure completion of
  7. the interconnections. Another, more often used, heuristic solution method for placement is simulated annealing.
     (c) Routing: The main objective in this step is to complete the interconnections between blocks according to the specified netlist. First, the space not occupied by the blocks (called the routing space) is partitioned into rectangular regions called channels and switchboxes. The goal of a router is to complete all circuit connections using the shortest possible wire length and using only the channels and switchboxes. This is usually done in two phases, referred to as global routing and detailed routing. In global routing, connections are completed between the proper blocks of the circuit, disregarding the exact geometric details of each wire and pin. For each wire, the global router finds a list of channels to be used as a passageway for that wire. In other words, global routing specifies the "loose route" of a wire through different regions of the routing space. Global routing is followed by detailed routing, which completes point-to-point connections between pins on the blocks. The loose routing is converted into exact routing by specifying geometric information such as the width of wires and their layer assignments. Detailed routing includes channel routing and switchbox routing. Because all problems in routing are computationally hard, researchers have focused on heuristic algorithms. As a result, experimental evaluation has become an integral part of all algorithms, and several benchmarks have been standardized. Owing to the nature of the routing algorithms, complete routing of all the connections cannot be guaranteed in many cases.
     (d) Compaction: The operation of minimizing the layout area without violating the design rules and without altering the original functionality of the layout is called compaction. The input to compaction is a layout, and the output is also a layout, but with a minimized area.
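To make the idea of shrinking a layout without breaking a spacing rule concrete, here is a minimal sketch of compaction along the x-direction only. The function names, the `(x, y, w, h)` block format, and the single `min_spacing` rule are assumptions made for illustration; the sketch also assumes the input layout is already legal (no overlaps) and that the left chip edge is at x = 0. Real compactors have to honour many more design rules.

```python
def compact_x(blocks, min_spacing):
    """Slide every block as far left as a spacing rule allows.

    blocks: dict mapping name -> (x, y, w, h) of a legal layout.
    Returns a dict mapping name -> new x coordinate. Heights and y positions
    are untouched, so this is compaction in one dimension only.
    """
    order = sorted(blocks, key=lambda name: blocks[name][0])  # left to right
    new_x = {}
    for i, name in enumerate(order):
        _, y, w, h = blocks[name]
        left_limit = 0.0
        for other in order[:i]:
            _, oy, ow, oh = blocks[other]
            # A separation constraint applies only if the two blocks share
            # some y-range, i.e. they would collide when sliding along x.
            if y < oy + oh and oy < y + h:
                left_limit = max(left_limit, new_x[other] + ow + min_spacing)
        new_x[name] = left_limit
    return new_x


def layout_width(blocks, xs):
    """Right-most edge of the layout for the given x positions."""
    return max(xs[name] + blocks[name][2] for name in blocks)


blocks = {
    "A": (0, 0, 2, 2),
    "B": (5, 0, 3, 2),
    "C": (4, 3, 2, 1),
    "D": (10, 1, 1, 3),
}
before = {name: geom[0] for name, geom in blocks.items()}
after = compact_x(blocks, min_spacing=1)
print(layout_width(blocks, before), layout_width(blocks, after))
# prints: 11 8.0
```

Each block ends up at the smallest x permitted by the blocks to its left that it could collide with, which is why block C, sharing no y-range with A, slides all the way to x = 0.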
     Compaction is done in three ways: (i) by reducing the space between blocks without violating the spacing design rules; (ii) by reducing the size of each block without violating the size design rules; (iii) by reshaping blocks without violating their electrical characteristics. Compaction is therefore a very complex process, because it requires knowledge of all the design rules. Depending on the strategy used, compaction algorithms are divided into one-dimensional algorithms (either in the x-dimension or the y-dimension), two-dimensional algorithms
  8. (both in the x-dimension and the y-dimension), and topological algorithms (moving separate cells according to routing constraints).
     Types of compaction techniques:
     (i) 1-dimensional compaction: In this technique, compaction is done in only one dimension, either the x-direction or the y-direction, until no further compaction is possible. Two types of constraints apply to this technique: the separation constraint and the connectivity constraint.
     (ii) 2-dimensional compaction: In this method, compaction is done in both the x-dimension and the y-dimension. 2-D compaction is in general much better than 1-D compaction; if solved optimally, it produces minimum-area layouts. The trade-off is that it is very time consuming. Thus 3/2-D compaction is used.
     (iii) 3/2-D compaction: In this technique, the blocks are moved in such a way that the move not only compacts the circuit but also resolves interferences. The geometry is not as free as in 2-D compaction. In this method two lists are formed, the ceiling and the floor. The ceiling is formed by the blocks that are visible from the top, and the floor by the blocks that are visible from the bottom. The algorithm selects the lowest block in the ceiling list and moves it to the place on the floor that maximizes the gap between floor and ceiling. The process continues until all blocks have been moved from the ceiling to the floor.
     VLSI – DESIGN STYLES:
     Although the physical design cycle is decomposed into several conceptually easier steps, each step is still computationally very hard. So, in order to reduce the complexity of physical design and to obtain a high yield, certain restricted models and design styles have been proposed. They are (i) the full-custom design style, (ii) the standard cell design style, (iii) the gate array design style, and (iv) the field programmable gate array (FPGA) design style.
     (i) Full-custom design style: It is the most general form of layout, in which the circuit is partitioned into a collection of sub-circuits according to some criterion, such as the functionality of each sub-circuit. In this design style, each sub-circuit is called a functional block or simply a block. The full-custom design style allows functional blocks to be of any size. Blocks can be placed at any location on the chip
  9. surface without restriction. In other words, this style is characterized by the absence of any constraints on the physical design process. This design style allows very compact designs. However, automating a full-custom design has a much higher complexity than automating the more restricted models. For this reason, it is used only when the final design must have minimum area and design time is less of a factor. The full-custom structure of a design is shown below. The automation process for a full-custom layout is still a topic of intensive research. Some phases of the physical design of a full-custom chip may be done manually to optimize the layout. Layout compaction is a very important aspect of full-custom design.
     The rectangular solid boxes around the boundary of the circuit are called I/O pads. Pads are used to complete interconnections between chips or between the chip and the board. The space not occupied by blocks is used for routing the interconnecting wires. Initially all the blocks are placed within the chip area with the objective of minimizing the total area. However, enough space must be left between the blocks to complete the routing. Usually several metal layers are used for routing interconnections. Currently, two metal layers are common for routing, and the three-metal-layer process is gaining acceptance as the fabrication costs become more feasible. The routing area needed between the blocks becomes smaller as more routing layers are used. This is because some routing is done on top of the transistors in the
  10. additional metal layers. If all the routing can be done on top of the transistors, the total chip area is determined by the area of the transistors. In a hierarchical design, each block in a full-custom design may be very complex and may consist of several sub-blocks, which in turn may be designed using the full-custom design style or other design styles. Because any block may be placed anywhere on the chip, optimizing the area and the interconnection wiring becomes difficult. Full-custom design is very time consuming; thus the method is inappropriate for very large circuits unless performance is of utmost importance. Full-custom design is usually used for the layout of chips such as microprocessors.
      (ii) Standard cell design style: This is a more restricted design style, and the design process is simpler than in the full-custom design style. The standard cell methodology considers the layout to consist of rectangular cells of the same height. Initially, a circuit is partitioned into several smaller blocks, each of which is equivalent to some predefined sub-circuit or cell. The functionality and electrical characteristics of each predefined cell are tested, analyzed, and specified. A collection of these cells is called a cell library, usually consisting of 200–400 cells. Terminals on cells may be located either on the boundary or in the center of the cells. Cells are placed in rows, and the space between two rows is called a channel. These channels are used to make interconnections between cells. If two cells to be interconnected lie in the same row or in adjacent rows, the channel between the rows is used for interconnection. However, if two cells to be connected lie in two nonadjacent rows, their interconnection wire passes through an empty space between two cells in a row, called a feedthrough.
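The channel and feedthrough rule for standard cell rows can be written down as a few lines of code. The sketch below is only an illustration: the function name, the row numbering from 0 upward, the convention that channel k lies between row k and row k + 1, and the choice of the channel above a row for same-row connections are all assumptions introduced here.

```python
def standard_cell_route(row_a, row_b):
    """Describe the routing resources for a net between cells in two rows.

    Assumed convention: rows are numbered 0, 1, 2, ... and channel k lies
    between row k and row k + 1.
    """
    lo, hi = sorted((row_a, row_b))
    if lo == hi:
        # Same row: a single adjacent channel is enough (the one above,
        # by the convention assumed in this sketch).
        return {"channels": [lo], "feedthrough_rows": []}
    return {
        "channels": list(range(lo, hi)),              # every channel the wire crosses
        "feedthrough_rows": list(range(lo + 1, hi)),  # rows the wire must pass through
    }


print(standard_cell_route(1, 2))
# prints: {'channels': [1], 'feedthrough_rows': []}
print(standard_cell_route(0, 3))
# prints: {'channels': [0, 1, 2], 'feedthrough_rows': [1, 2]}
```

Cells in adjacent rows need no feedthrough, while each intervening row between nonadjacent rows costs one.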
      Standard cell design is well suited for moderate-size circuits and medium production volumes. Physical design using standard cells is simpler than full-custom design and is efficient with modern design tools. The standard cell design style is also widely used to implement the "random logic" of a full-custom design. While standard cell designs are developed more quickly, a substantial initial investment is needed in the development of the cell library, which may consist of several hundred cells. Each cell in the cell library is "handcrafted" and requires a highly skilled design engineer. Each type of cell must be created with several transistor sizes. Each cell must then be tested by simulation, and its performance must be characterized. A standard cell design usually takes more area than a full-custom or handcrafted design.
  11. However, as more metal layers become available for routing, the difference in area between the two design styles will gradually be reduced.
      (iii) Gate array design style: This design style is a simplified version of the standard cell design style. Unlike the cells in standard cell designs, all the cells in a gate array are identical. The entire wafer is prefabricated with an array of identical gates or cells. These cells are separated by vertical and horizontal spaces called vertical and horizontal channels. The circuit design is modified such that it can be partitioned into a number of identical blocks. Each block must be logically equivalent to a cell on the gate array. The name "gate array" signifies the fact that each cell may simply be a gate, such as a three-input NAND gate. Each block in the design is mapped or placed onto a prefabricated cell on the wafer during the partitioning/placement phase, which is reduced to a block-to-cell assignment problem. The number of partitioned blocks must be less than or equal to the total number of cells on the wafer. Once the circuit is partitioned into identical blocks, the task is to make the interconnections between the prefabricated cells on the wafer, using the horizontal and vertical channels, to form the actual circuit. The uncommitted gate array is taken into the fabrication facility, and routing layers are fabricated on top of the wafer. The completed wafer is also called a customized wafer.
      The simplicity of gate array design is gained at the cost of the rigidity imposed upon the circuit by both the technology and the prefabricated wafers. The advantage of gate arrays is that the steps involved in creating any prefabricated wafer are the same, and only the last few steps in the fabrication process actually depend on the application for which the design will be used. Hence, gate arrays are cheaper and easier to produce than full-custom or standard cell designs. Like standard cell design, gate array is a nonhierarchical structure. The gate array architecture is the most restricted form of layout, which means that it is the simplest for algorithms to work with. For example, the task of routing in a gate array is to determine whether a given placement is routable. The routability problem is conceptually simpler than the routing problem in the standard cell and full-custom design styles.
      (iv) Field programmable gate array (FPGA) design: In this design style, chips are prefabricated with logic blocks and interconnects. The logic and interconnects can be programmed (erased and reprogrammed) by users. No fabrication is needed. The interconnects are predefined wire segments of fixed lengths with switches in between.
  12. Conclusion: The choice of design style for a particular circuit depends on many factors, such as the functionality of the chip, the time-to-market, and the volume of chips to be manufactured. Full-custom is typically reserved for high-performance, high-volume chips, while standard cells are used for moderate performance, where the cost of full-custom cannot be justified. Gate arrays are typically used for low-performance, low-cost applications. A design style may be applied to the entire chip or to a block of the chip.
      Placement Algorithms:
      Placement algorithms can be classified on the basis of the input to the algorithms, the nature of the output generated by the algorithms, and the process used by the algorithms. Depending on the input, placement algorithms can be classified into two major groups: constructive placement and iterative improvement methods. The input to a constructive placement algorithm consists of a set of blocks along with the netlist. The algorithm finds the locations of the blocks. Iterative improvement algorithms, on the other hand, start with an initial placement and modify it in search of a better placement. These algorithms are typically applied iteratively until no improvement is possible.
      Simulated Annealing Placement Algorithm:
      This is one of the popular placement algorithms used in the VLSI design process. The simulated annealing technique has been used successfully in many phases of VLSI physical design, e.g., circuit partitioning. Simulated annealing is used in placement as an iterative improvement algorithm. Given a placement configuration, a change to that configuration is made by moving a component or interchanging the locations of two components. In the case of the simple pairwise interchange algorithm, it is possible that the configuration reached has a cost higher than that of the optimum, but no interchange can cause a further cost reduction. In such a situation, the algorithm is trapped at a local optimum and cannot proceed further. This happens quite often when the algorithm is used on real-life examples. Simulated annealing avoids getting stuck at a local optimum by occasionally accepting moves that result in a cost increase. The simulated annealing algorithm is given below.
      Begin
        temp = INIT-TEMP;
        place = INIT-PLACEMENT;
  13.   while (temp > FINAL-TEMP) do
          while (inner_loop_criterion = FALSE) do
            new_place = PERTURB(place);
            ΔC = COST(new_place) - COST(place);
            if (ΔC < 0) then
              place = new_place;
            else if (RANDOM(0,1) < e^(-ΔC/temp)) then
              place = new_place;
          temp = SCHEDULE(temp);
      End;
      In simulated annealing, all moves that result in a decrease in cost are accepted. Moves that result in an increase in cost are accepted with a probability that decreases over the iterations. The analogy to the actual annealing process is heightened by the use of a parameter called temperature, T. This parameter controls the probability of accepting moves that result in an increased cost. More such moves are accepted at higher values of temperature than at lower values. The acceptance probability can be given by $e^{-\Delta C/T}$, where ΔC is the increase in cost. The algorithm starts with a very high value of temperature, which gradually decreases so that moves that increase cost have a lower probability of being accepted. Finally, the temperature reduces to a very low value, which causes only moves that reduce cost to be accepted. In this way the algorithm converges to an optimal or near-optimal configuration.
      In each trial, the current configuration is randomly shuffled. This random shuffling could be achieved by displacing a block to a random location, interchanging two blocks, or any other move that can change the wire length. After the shuffle, the change in cost is evaluated. If a decrease in cost occurs, the configuration is accepted; otherwise, the new configuration is accepted with a probability that depends on the temperature. The temperature is then lowered using some function which, for example, could be exponential in nature. The process is stopped when the temperature has dropped to a certain level. The parameters and functions used in a simulated annealing algorithm determine the quality of the placement produced.
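The loop described above can be sketched in runnable form. This is a minimal illustration, not a production placer: the grid of slots, the two-pin nets, the Manhattan wirelength cost, the pairwise-swap perturbation, the geometric cooling factor `alpha`, and all default parameter values are assumptions chosen here. Remembering the best placement seen so far is also an addition that the pseudocode does not contain.

```python
import math
import random


def wirelength(place, nets):
    """COST: total Manhattan length of two-pin nets; place maps block -> (x, y)."""
    return sum(abs(place[a][0] - place[b][0]) + abs(place[a][1] - place[b][1])
               for a, b in nets)


def perturb(place, rng):
    """PERTURB: interchange the locations of two randomly chosen blocks."""
    a, b = rng.sample(sorted(place), 2)
    new_place = dict(place)
    new_place[a], new_place[b] = place[b], place[a]
    return new_place


def anneal(place, nets, init_temp=10.0, final_temp=0.01, alpha=0.9,
           trials_per_temp=50, seed=0):
    """Simulated annealing placement; returns (best placement, its cost)."""
    rng = random.Random(seed)
    cost = wirelength(place, nets)
    best, best_cost = place, cost
    temp = init_temp
    while temp > final_temp:
        for _ in range(trials_per_temp):      # inner loop: trials at this temperature
            new_place = perturb(place, rng)
            delta = wirelength(new_place, nets) - cost
            # Improvements are always accepted; a cost increase is accepted
            # with probability exp(-delta / temp).
            if delta < 0 or rng.random() < math.exp(-delta / temp):
                place, cost = new_place, cost + delta
                if cost < best_cost:
                    best, best_cost = place, cost
        temp = alpha * temp                   # SCHEDULE: exponential cooling
    return best, best_cost


# Six blocks on a 3 x 2 grid of slots, connected in a chain a-b-c-d-e-f.
start = {"a": (0, 0), "b": (2, 1), "c": (0, 1), "d": (2, 0), "e": (1, 0), "f": (1, 1)}
nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
best, best_cost = anneal(start, nets)
print(wirelength(start, nets), best_cost)
```

The keyword arguments of `anneal` play the roles of the parameters and functions that the notes name for the algorithm, so changing them is the way to experiment with the quality of the result.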
These parameters and functions include the cooling schedule, consisting of the initial temperature (init_temp), the final temperature (final_temp) and the function used for changing the temperature (SCHEDULE); the inner_loop_criterion, which is the number of trials at each temperature; the process used for shuffling a configuration (PERTURB); the acceptance probability (F); and the cost function (COST). A good choice of these parameters and functions can result in a good placement in a relatively short time.
Backtrack Search Algorithm:
In finding a solution to a problem, a sequence of decision points is considered, where each choice leads further along some path. If the correct set of choices is made, the path ends at the solution. If, on the other hand, a dead end is reached or a choice is found to be incorrect, the search backtracks to a previous decision point and tries a different path. Algorithms that use this approach are called backtracking algorithms. A backtracking algorithm thus uses recursive depth-first search to explore the search space. Depth-first search keeps going down a path as long as it can. If it reaches a node with no children (a "dead end"), it backtracks to the parent and tries another child node that it has not yet explored. If it has explored all child nodes, it backtracks up one more level and continues. If the average branching factor is b and the depth of the search tree is k, then backtrack search requires $O(b^k)$ time, which is exponential. Classic examples of problems requiring backtrack search.
Branch and Bound algorithm:
Branch and bound is a variant of backtracking search that takes advantage of information about the optimality of partial solutions to avoid considering solutions that cannot be optimal. The search is still exhaustive in principle, but it can avoid exploring large parts of the search space that cannot contain an optimal solution. Given an initial problem and some objective function f to be minimized, the branch and bound technique works as follows.
If the problem is small enough, it is solved directly. Otherwise the problem is decomposed into two or more subproblems. Each subproblem is characterized by the inclusion of one or more constraints. For each subproblem, a lower bounding function g is computed. This lower bound represents the smallest possible cost of a solution to the subproblem, given the constraints on that subproblem.
Simulation:
The objective of any simulation tool is to create a computer-based model for verifying the design and analyzing the behavior of the circuit under construction at the current level of abstraction.
Types of Simulation:
Device level simulation.
Circuit level simulation.
Timing level and macro level simulation.
Switch level simulation.
Gate level simulation.
RTL simulation.
System level simulation.
Device level simulation: This model deals with a single semiconductor device, such as a MOS transistor, and is used to test the effect of fabrication parameters. Simulation techniques based on the finite-element method are used for this purpose.
Circuit level simulation: It deals with small groups of transistors modeled in the analog domain. The variables computed are currents and voltages, and the computations are based on numerical methods.
Switch level simulation: This method models the MOS transistors as switches that pass signals. The signal values are discrete, but the model also includes certain analog features to account for components such as resistance and capacitance.
Gate level simulation: In this model a circuit is composed of several logic gates connected by unidirectional, memoryless wires. The logic gates themselves are collections of transistors and other circuit elements which perform a logic function. A logic gate may be a simple inverter, NAND gate or NOR gate, or a more complex functional unit such as a flip-flop or register.
Register-Transfer Level (RTL) simulation: This model is used for synchronous circuits in which all registers are controlled by a system clock signal. The registers store the state of the system, while the combinational logic computes the next state and the output based on the current state and the input. Here the important consideration is the sequence of state transitions; the precise timing of intermediate signals in the computation of the next state is not considered.
System level simulation: It deals with hardware described in terms of primitives that need not correspond to hardware building blocks. VHDL is the most popular hardware description language used for system level simulation. When used in the initial stages of a design, it can describe the behavior of a circuit such as a processor as a set of communicating processes.
Gate Level Modeling and Simulation:
The gate level model forms the theoretical basis for logic design. In this model a circuit is composed of several logic gates connected by unidirectional, memoryless wires. The logic gates themselves are collections of transistors and other circuit elements which perform a logic function. A logic gate may be a simple inverter, NAND gate or NOR gate, or a more complex functional unit such as a flip-flop or register. The logic gates compute the Boolean functions corresponding to their input signals and transmit the values along wires to the inputs of the other gates to which they are connected. Each input of a gate has a unique signal source. Information is stored only in the feedback paths of sequential circuits. Gate level modeling and simulation is classified into the following four types.
Signal Modeling: Signal modeling deals with the signals applied to a logic gate. Normally Boolean signals take the value 0 or 1. A signal that is neither 0 nor 1 is denoted by X, the unknown value; it can, for example, indicate a transition from one state to the other. The more values are used for a signal, the more complex the modeling of a gate becomes. If a gate has n input signals, each of which can take N values, the output must be specified for $N^n$ input combinations.
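As a small illustration of how the table size grows, the following Python sketch tabulates a two-input AND gate over the three values 0, 1 and X. The function names `and3` and `truth_table` are hypothetical, and the rule used for X (a 0 on any input forces the output to 0, otherwise any X makes the output X) is the usual three-valued convention, assumed here rather than taken from the text.

```python
from itertools import product

VALUES = ("0", "1", "X")          # N = 3 signal values

def and3(*inputs):
    """Three-valued AND: a controlling 0 fixes the output even if others are X."""
    if "0" in inputs:
        return "0"
    if all(v == "1" for v in inputs):
        return "1"
    return "X"

def truth_table(gate, n_inputs):
    """Enumerate all N**n input combinations of a gate with n inputs."""
    return {combo: gate(*combo) for combo in product(VALUES, repeat=n_inputs)}

table = truth_table(and3, 2)
print(len(table))                 # 3**2 entries
```

With three signal values, a two-input gate needs 9 table entries and a four-input gate already needs 81, which is why adding signal values makes gate models more expensive.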
The logic involved in dealing with a circuit modeled using multiple-valued discrete signals is called multiple-valued logic.
Gate Modeling: This deals with modeling the behavior of a single gate. The model should be such that the signal values at the gate's outputs are computed efficiently as a function of the gate's inputs. The outputs of a gate are represented either by a truth table or by a subroutine.
Delay Modeling: At the gate level, time is modeled in a discrete way and all delays in the circuit are expressed as integer multiples of a time unit. The output of any physical gate takes some time to switch after its input has switched. This delay can affect the correct functioning of the circuit, especially when the circuit is asynchronous, so the delays must be modeled correctly. The important delay models are the following.
Propagation delay model: A fixed delay is associated with the gate's output, so the effect of switching an input is observed at the output only after that delay.
Rise/fall delay model: This model is concerned with the rising and falling transitions at the output of a gate. The output always takes some time to settle after a rise or fall of a signal, and the model accounts for the time taken by each kind of transition.
Inertial delay model: The input pulse must have a minimum width in order to have any effect at the output. Inertial delays are caused by the capacitive elements in the gate. The inertial delay model can be combined with the propagation and rise/fall delay models.
Connectivity Modeling: This model describes how the gates in the network are connected. The simulator needs suitable data structures to represent this connectivity.
The unilateral nature of logic gates is the basis for the operation of gate level simulators. For each binary vector at the input nodes of a logic gate, the binary value (0 or 1) at the output is computed and propagated to the inputs of the other gates connected to it. During this propagation a certain time delay occurs because of inertial elements, such as node capacitances, present in the circuit.
Simulators that do not consider this delay can analyze only combinational circuits. Simulators that handle sequential circuits must therefore estimate the propagation delay through a logic gate, and they do so in several ways. Some simulators operate in unit-delay mode, where all logic gates are assumed to have the same delay. Such unit-delay simulators, however, can verify only the steady-state behavior, that is, the logic functionality, of the digital circuit. Differences in the propagation delays along different signal paths in a network of logic gates sometimes cause undesirable situations such as static hazards and dynamic hazards. Hazards are situations where spurious glitches or spikes occur in an otherwise smooth waveform at the output of a logic gate.
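A static hazard caused by unequal path delays can be reproduced with a very small sketch. The circuit f = a·b + a'·c used here is a hypothetical textbook-style example, not one taken from these notes, and every gate (inverter, two ANDs, one OR) is assumed to have one unit of delay. With b = c = 1 the output should stay at 1, but the path through the inverter is one gate longer, so a falling edge on a produces a momentary 0.

```python
def unit_delay_trace(a_wave, b=1, c=1):
    """Unit-delay simulation of f = a.b + a'.c; returns f at each time step."""
    a0 = a_wave[0]
    # Start from the steady state that corresponds to the first input value.
    na, g1, g2 = 1 - a0, a0 & b, (1 - a0) & c
    f = g1 | g2
    trace = []
    for a in a_wave:
        trace.append(f)
        # Each gate output at time t+1 is computed from signal values at time t.
        na, g1, g2, f = 1 - a, a & b, na & c, g1 | g2
    return trace

print(unit_delay_trace([1, 1, 0, 0, 0, 0]))   # falling edge on a
print(unit_delay_trace([0, 0, 1, 1, 1, 1]))   # rising edge on a
```

The first call prints `[1, 1, 1, 1, 0, 1]`: the single 0 is the glitch. The second call prints all 1s, because on a rising edge the shorter path turns on before the longer path turns off. The unit delays only illustrate the mechanism; they say nothing about the real timing of a circuit.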
Compiler Driven Simulation:
There are two basic mechanisms for simulating a circuit at the gate level: (i) compiler-driven simulation and (ii) event-driven simulation.
Compiler-driven simulation is applied to synchronous circuits. The core of such circuits consists of registers that store the state of the system and combinational logic that computes the next state.
Event-driven simulation is based on the observation that, under normal conditions, very few gates switch simultaneously, so computing signal propagation through all gates in the network over and over again at each time instant leads to unnecessary computation. It is more economical to compute only those signals that are actually changing state. A change in signal state is called an event, hence the name event-driven simulation.
In a sequential circuit, the occurrence of a glitch could cause the circuit to malfunction. The detection of hazards and race conditions is therefore very important, and most digital simulators alert the user when they occur. Hazards can be detected by introducing a third state, denoted by X, which represents a signal transition. Many simulators use this third value X to represent an unknown or undefined logic level. The X state indicates an uninitialized signal, a signal held between two logic thresholds, or a signal in a 0 → 1 or 1 → 0 transition. The X state is handled algebraically by extending the binary Boolean algebra to a ternary (three-valued) De Morgan algebra, which preserves most of the desired properties of the gate model. Some other simulators implement the X state by an enumeration technique in which the simulation is repeated with the nodes in the X state set to all possible combinations of 0s and 1s. Nodes that remain in a unique binary state for all combinations are set to that state, while the others are set to X.
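The event-driven idea described above can be sketched with a priority queue of (time, signal, value) events. This is a minimal illustration under assumptions: the three-gate netlist `GATES`, the function name `event_driven` and the fixed unit gate delay are all hypothetical, and the initial signal values are assumed to be consistent with the gate functions.

```python
import heapq

# Hypothetical netlist: output signal -> (gate function, input signals).
GATES = {
    "n1": (lambda a, b: a & b, ("a", "b")),
    "n2": (lambda c: 1 - c, ("c",)),
    "out": (lambda x, y: x | y, ("n1", "n2")),
}

def event_driven(values, stimuli, delay=1):
    """values: consistent initial signal values; stimuli: [(time, signal, value)]."""
    fanout = {}
    for out, (_, ins) in GATES.items():
        for s in ins:
            fanout.setdefault(s, []).append(out)
    queue = list(stimuli)
    heapq.heapify(queue)
    log, evaluations = [], 0
    while queue:
        time, sig, val = heapq.heappop(queue)
        if values[sig] == val:
            continue                       # value unchanged, so this is not an event
        values[sig] = val
        log.append((time, sig, val))
        for out in fanout.get(sig, []):    # evaluate only gates fed by the changed signal
            fn, ins = GATES[out]
            evaluations += 1
            new_val = fn(*(values[s] for s in ins))
            heapq.heappush(queue, (time + delay, out, new_val))
    return log, evaluations

init = {"a": 0, "b": 1, "c": 1, "n1": 0, "n2": 0, "out": 0}
log, evaluations = event_driven(init, [(0, "a", 1)])
print(log, evaluations)
```

For this stimulus only the AND gate and the OR gate are evaluated (2 evaluations); the inverter driving n2 is never touched because its input c does not change.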
To simulate tri-state gates and logic buses, some simulators use a fourth state called the high impedance state, normally denoted by H (or Z). The H state is also sometimes used to model dynamic memory, by allowing a node to retain its previous logic state if the outputs of all logic gates connected to the node are at the H level.
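A node driven by several tri-state outputs can be resolved with a short function such as the one below. This is a sketch under assumptions: the name `resolve` is hypothetical, the high impedance state is written as "Z", and treating conflicting drivers as X is a common convention that the text does not state.

```python
def resolve(drivers, previous):
    """Resolve a bus node from its driver outputs ('0', '1', 'X' or 'Z')."""
    active = {d for d in drivers if d != "Z"}
    if not active:
        return previous        # all drivers in high impedance: node keeps its old state
    if active == {"0"} or active == {"1"}:
        return active.pop()    # the enabled drivers agree on one binary value
    return "X"                 # conflicting or unknown drivers (assumed convention)

print(resolve(["Z", "1", "Z"], previous="0"))
print(resolve(["Z", "Z"], previous="0"))
```

The first call returns "1" because exactly one driver is enabled; the second returns "0", which models the charge retention used for dynamic memory.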
Gate level simulators are not completely suitable for the logic simulation of MOS circuits, because there is a mismatch between the Boolean gate model and the behavior of MOS logic circuits. A different approach to the digital modeling and simulation of MOS circuits is therefore needed, namely switch level logic simulation.
Switch level modeling and simulation:
Switch level simulators were developed for the simulation of MOS circuits. One of the first switch level simulators to be implemented was MOSSIM. In contrast to gate level modeling and simulation, switch level techniques operate directly on the transistor circuit structure and capture many circuit properties that cannot be represented in the gate level model, for example the bidirectionality of signal flow, charge sharing effects and transistor sizes. In contrast to circuit level simulation, switch level modeling represents node voltages by discrete logic levels and transistors by bidirectional resistive switches.
A MOS circuit is thus modeled as a set of nodes connected by resistive switches. Node voltages are represented by a small number of discrete logic levels such as 0, 1, and X or U, where U denotes undefined or unknown. In some cases the high impedance state Z is also used. The resistive switch model of a transistor is controlled by the voltage level at its gate terminal. An n-type transistor conducts when its gate voltage is 1, and a p-type transistor conducts when its gate voltage is 0. Transistors are allowed to have discrete strength values that depend on their conductances when fully ON. This is done to model the behavior of ratioed logic. As an example, a depletion load transistor used in n-channel MOS circuit design has its gate logic level set to 1, and its strength is weaker than that of an enhancement type transistor.
Transistors in series are equivalent to a single transistor of strength equal to the weakest one, while transistors in parallel are equivalent to a single transistor of strength equal to the strongest one (maximum conductance).
In most switch level simulators, the circuit is partitioned into channel-connected sub-circuits. This partitioning can be done once at the outset, where every transistor is included, or dynamically at every iteration, where only conducting transistors are included. Dynamic partitioning adds some computational overhead. The simulation of the entire circuit is driven by an event scheduler similar in many ways to that of gate level logic simulators, except that the "gates" now consist of channel-connected transistors.
LOGIC SYNTHESIS - VERIFICATION
INTRODUCTION:
Logic synthesis is the process of converting a high-level description of a design into an optimized gate-level representation. Logic synthesis uses a standard cell library, which contains simple cells, such as the basic logic gates AND, OR and NOR, and macro cells, such as adders, muxes, memories and flip-flops. The standard cells taken together are called the technology library. The technology library is normally identified by the transistor size (0.18 µm, 90 nm).
Boolean Logic Functions Representation:
Various methods are in use to represent logic functions, for example (i) the truth table, (ii) the equation, (iii) the K-map and (iv) the circuit. However, these representations are either not unique (canonical) or, like the truth table, become too large to handle for functions of many variables. To synthesize, optimize, verify or manipulate large Boolean functions, they must be represented efficiently. One method for representing complex Boolean functions is the Binary Decision Diagram (BDD). The BDD, which is canonical in its reduced ordered form, is the most popular of these methods.
Binary Decision Diagram (BDD):
A binary decision diagram (BDD) is a graphical representation of a Boolean function that is derivable from Shannon's expansion theorem. It is similar to a binary tree. A BDD is a finite DAG (Directed Acyclic Graph) with the following features.
It has a unique initial node.
All non-terminals are labelled with a Boolean variable.
All terminals are labelled with 0 or 1.
All edges are labelled with 0 (dashed edge) or 1 (solid edge).
Each non-terminal has exactly one out-edge labelled 0 and one out-edge labelled 1.
Shannon's Expansion Theorem:
Shannon's expansion theorem is used iteratively to build a BDD for a given Boolean function. The theorem states that "any switching function of n variables can be expressed as a sum of products of n literals, one for each variable". Let us assume that $f(x_1, x_2, \ldots, x_n)$ is a switching function of n variables. According to Shannon, one way of expressing this function is
$f(x_1, x_2, \ldots, x_n) = x_1 f(1, x_2, \ldots, x_n) + x_1' f(0, x_2, \ldots, x_n)$
On the right side, the function is the sum of two terms, one of them relevant when $x_1$ is equal to 1 and the other when $x_1$ is equal to 0. The first term is $x_1$ times what remains of f when $x_1$ is equal to 1, and the second term is $x_1'$ times what remains of f when $x_1$ is equal to 0. Shannon's expansion theorem in the general case is
$f = a_0 x_1' x_2' \cdots x_n' + a_1 x_1' x_2' \cdots x_{n-1}' x_n + a_2 x_1' x_2' \cdots x_{n-1} x_n' + \cdots + a_{2^n-2} x_1 x_2 \cdots x_n' + a_{2^n-1} x_1 x_2 \cdots x_n$
Each $a_i$ is a constant whose subscript is the decimal equivalent of the multiplier of $a_i$ viewed as a binary number. Thus, for three variables, $a_5$ (binary 101) is the coefficient of $x_1 x_2' x_3$. In a similar way, the theorem can be stated as "any switching function of n variables can be expressed as a product of sums of n literals, one for each variable".
Binary Decision Diagram (BDD) - Example:
Let us consider an example of constructing a BDD. The output of a Boolean function S is given by the truth table, and the function is implemented by the MUX based circuit shown in the diagram.
Figure (i): Truth table. Figure (ii): Mux based circuit.
This Boolean function is represented by the binary decision diagram shown in Figure (c).
Figure (c): Binary Decision Diagram.
In the BDD, a line with a bubble on it denotes the value 0 and a line without a bubble denotes the value 1. Let us consider S(0,0,0) in Figure (d) and S(1,1,1) in Figure (e). There are several conventions for denoting the values 1 and 0, for example:
Bubble vs. non-bubble line
Dashed line vs. solid line
T (then) vs. E (else) labels
Figures: (i) bubble vs. no bubble, (ii) dashed line vs. solid line, (iii) T (then) vs. E (else) labels.
BDDs for Basic Logic Functions:
The following diagrams denote some basic logic functions such as NOT, OR and AND.
Ordered Binary Decision Diagram (OBDD):
A binary decision diagram is said to be an ordered binary decision diagram if the variables appear in the same order along all paths from the root to the leaves. Let [x1, x2, ..., xn] be an ordered list of variables without duplicates. A BDD is said to have the ordering [x1, x2, ..., xn] if all its variable labels occur in that list and, whenever xj follows xi along a path, j > i. For the variables a, b, c, d the ordering should be such that a ≤ b ≤ c ≤ d, as shown in the diagram below.
Figures (a) and (b).
Diagram (b) shows an optimal ordering, because there is exactly one node for each variable. The order is b ≤ c ≤ a ≤ d.
Reduction operations (ROBDD):
1. Removal of duplicate terminals. If a BDD contains more than one terminal 0-node, redirect all edges which point to such a 0-node to just one of them. Proceed in the same way with the terminal nodes labelled 1.
2. Removal of redundant tests. If both outgoing edges of a node n point to the same node m, eliminate node n and send all its incoming edges to m.
3. Removal of duplicate non-terminals. If two distinct nodes n and m in the BDD are the roots of structurally identical sub-BDDs, eliminate one of them, say m, and redirect all its incoming edges to the other one.
A BDD is reduced if it has been simplified as much as possible using these reduction operations.
Examples: (i) Remove duplicate terminals.
(ii) Remove redundant tests. (iii) Remove duplicate non-terminals. (iv) Remove duplicate non-terminals and redundant tests.
BDD Canonical Form:
Binary decision diagrams are said to be canonical (unique) for a given ordering if all internal nodes are descendants of some node, there are no isomorphic sub-graphs, and for every node fT ≠ fE.
Advantages of Canonical representation:
Representing a Boolean function in canonical form makes it easy to check for redundant variables. A Boolean function f does not depend on an input variable x if no node for x occurs in the reduced ordered binary decision diagram (ROBDD). The equivalence of two functions f and g can be checked easily by comparing the structure of their ROBDDs. The various manipulations on BDDs can be performed directly if the function is represented in canonical form.
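The reduction rules and the canonical property can be tried out with a small sketch. The class below builds a reduced ordered BDD by applying Shannon's expansion to one variable at a time, in a fixed order. It is an illustration under assumptions, not a production BDD package: the names `ROBDD`, `mk` and `build` are hypothetical, the function is given as a Python callable, and all input combinations are enumerated, so it is only practical for a handful of variables.

```python
class ROBDD:
    def __init__(self, order):
        self.order = order      # variable ordering, e.g. ["a", "b", "c"]
        self.unique = {}        # (variable index, low child, high child) -> node id
        self.nodes = {}         # node id -> (variable index, low child, high child)

    def mk(self, i, low, high):
        if low == high:         # redundant test: both edges reach the same node
            return low
        key = (i, low, high)
        if key not in self.unique:         # share structurally identical nodes
            node = len(self.unique) + 2    # ids 0 and 1 are the two terminals
            self.unique[key] = node
            self.nodes[node] = key
        return self.unique[key]

    def build(self, f, i=0, env=()):
        """Shannon-expand f on variable i; f takes len(order) arguments of 0 or 1."""
        if i == len(self.order):
            return int(bool(f(*env)))
        low = self.build(f, i + 1, env + (0,))     # cofactor with variable i = 0
        high = self.build(f, i + 1, env + (1,))    # cofactor with variable i = 1
        return self.mk(i, low, high)

bdd = ROBDD(["a", "b", "c"])
r1 = bdd.build(lambda a, b, c: (a and b) or (a and c))
r2 = bdd.build(lambda a, b, c: a and (b or c))
print(r1 == r2)     # equivalent functions share one root node
```

Because duplicate nodes are never created, two equivalent functions built in the same `ROBDD` object end up with the same root id, so the equivalence check is a single comparison. A variable on which the function does not depend never appears in `nodes`.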
HIGH-LEVEL SYNTHESIS
INTRODUCTION:
The basic objective of high-level synthesis is the mapping of a behavioral description of a digital system into an RTL design consisting of a data path and a control unit. High-level synthesis (HLS) is thus a sequence of tasks that transforms a behavioral representation into an RTL design. The design consists of functional units such as ALUs and multipliers, storage units such as memories and register files, and interconnection units such as multiplexers and buses. One of the tasks of synthesis is to find the structure that best meets the constraints, such as limitations on cycle time, area or power, while minimizing other costs. For example, the goal might be to minimize area while achieving a certain minimum processing rate.
From the behavioral input specification, the synthesis system produces a description of a register-transfer structure that realizes the specified behavior. This structure includes a data path, that is, a network of registers, functional units, multiplexers and buses, as well as hardware to control the data transfers in that network. If the control is not integrated into the data path, the synthesis system must also produce the specification of a finite state machine that drives the data path so as to produce the required behavior. The control specification could be in terms of microcode, a PLA profile or random logic.
High-level synthesis differs from logic synthesis in that the input to logic synthesis is specified in terms of logic equations, which must be optimized and mapped into a given technology. Logic synthesis is applied to a design after high-level synthesis has been done. So we can say that high-level synthesis (HLS) is a new step in the design flow of a digital electronic circuit, moving the design effort to higher abstraction levels.
The advantage of high-level synthesis is that it raises the design abstraction level and allows rapid generation of RTL hardware optimized for performance, area and power requirements. In addition, several other advantages arise from the use of HLS in the design flow. First of all, the amount of code to be written by designers is reduced dramatically, which saves time and reduces the risk of mistakes.
Verification time is also reduced considerably, because the HLS tool can generate test benches in addition to the design itself, thereby reusing the test data that was used to validate the source code. This is particularly relevant for the design of FPGA based embedded systems.
HARDWARE MODELS:
All HLS systems need to restrict the target hardware. Most systems generate synchronous hardware and build it from the following parts:
Functional units: They can perform one or more computations, e.g. addition, multiplication, comparison, ALU.
Registers: They store inputs, intermediate results and outputs; sometimes several registers are taken together to form a register file.
Multiplexers: From several inputs, one is passed to the output.
Buses: A bus is a connection shared between several hardware elements, such that only one element can write data at a specific time. Three-state (tri-state) drivers control the exclusive writing on the bus.
Hardware is normally partitioned into two parts:
The data path: A network of functional units, registers, multiplexers and buses. The actual "computation" takes place in the data path.
Control: The part of the hardware that takes care of having the data present at the right place at a specific time, of presenting the right instructions to a programmable unit, etc.
High-level synthesis often concentrates on data-path synthesis. The control part is then realized as a finite state machine or in microcode.
Synthesis tasks:
High-level synthesis maps a behavioral description into the FSMD model (FSM with a data path), so that the data path executes the variable assignments and the control unit implements the control constructs. Since the FSMD model determines the amount of computation in each state, one must first define the number and type of resources (storage units, functional units and interconnection units) to be used in the data path. Allocation is the task of defining the necessary resources for a given design constraint.
The next task in mapping a behavioral description into an FSMD model is to partition the behavioral description into states (or control steps) so that the allocated resources can compute all the variable assignments in each state. This partitioning of behavior into time intervals is called scheduling.
Although scheduling assigns each operation to a particular state, it does not assign it to a particular component. To obtain the proper implementation, we assign each variable to a storage unit, each operation to a functional unit, and each transfer from I/O ports to units and among units to an interconnection unit. This task is called binding (or resource sharing).
Binding defines the structure of the data path but not the structure of the control unit. The final task, control synthesis, consists of reducing and encoding states and deriving the logic network for the next-state and control signals in the control unit. Control synthesis employs well-known logic synthesis techniques.
Allocation: The allocation task determines the type and quantity of resources used in the chip architecture. It also determines the clocking scheme, memory hierarchy and pipelining style.
The goal of allocation is to make appropriate trade-offs between the design's cost and performance. If the original description contains inherent parallelism, allocating more hardware resources increases area and cost, but it also creates more opportunities for parallel operations or storage accesses, resulting in better performance. On the other hand, allocating fewer resources decreases area and cost, but it also forces operations to execute sequentially, resulting in poorer performance. To perform the required trade-offs, allocation must determine the exact area and performance values.
Scheduling: The next step schedules operations and memory accesses into clock cycles. Scheduling algorithms are of two types, based on the optimization goal and the specified constraints. If the user has completely specified all the available resources and the clock cycle length during allocation, the scheduling algorithm's goal is to produce a design with the best possible performance, that is, the fewest clock cycles. In other words, scheduling must maximize usage of the allocated resources. This approach is called resource-constrained scheduling. If a list of resources is not available prior to scheduling, but a desired overall performance is specified, the scheduling algorithm's goal is to produce a design with the lowest possible cost, that is, the fewest functional units. This is the time-constrained scheduling approach.
Resource-constrained scheduling usually constructs the schedule one state at a time. It schedules operations so as not to exceed resource constraints or violate data dependencies. It ensures that at the instant for which it schedules an operation Oi into control step Sj, a resource capable of executing Oi is available and all the predecessors of node Oi have been scheduled.
Binding: The binding task assigns the operations and memory accesses within each clock cycle to available hardware units.
A resource such as a functional, storage, or interconnection unit can be shared by different operations, data accesses, or data transfers if they are mutually exclusive. For example, two operations assigned to two different control steps are mutually exclusive since they will never execute simultaneously; hence they can be bound to the same hardware unit. Binding consists of three subtasks based on the unit type.
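The sharing rule for functional units can be illustrated with a very simple greedy binding. This is a sketch under assumptions, not one of the binding algorithms used by real HLS tools: the schedule format, the function name `bind` and the operation names are hypothetical, and only functional-unit binding is shown (storage and interconnection binding are left out).

```python
from collections import defaultdict

def bind(schedule):
    """schedule: {op: (control_step, op_type)} -> {op: (op_type, unit index)}."""
    used = defaultdict(int)    # (step, type) -> number of units already taken
    binding = {}
    for op in sorted(schedule, key=lambda o: (schedule[o][0], o)):
        step, kind = schedule[op]
        # Operations in the same step need distinct units; operations in
        # different steps are mutually exclusive and may reuse unit 0, 1, ...
        binding[op] = (kind, used[(step, kind)])
        used[(step, kind)] += 1
    return binding

schedule = {"m1": (1, "mul"), "m2": (1, "mul"), "m3": (2, "mul"),
            "a1": (2, "add"), "a2": (3, "add")}
binding = bind(schedule)
print(binding)
```

Here m1 and m3 are bound to the same multiplier because they execute in different control steps, while m1 and m2 need two multipliers because they share step 1. The number of units of each type is the largest number of operations of that type in any single step.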
Scheduling Algorithms:
Scheduling is very important because it determines the amount of concurrency in the resulting implementation. It therefore also determines the amount of hardware resources required, and hence the area. There are two important scheduling algorithms: ASAP (as soon as possible) and ALAP (as late as possible).
ASAP Scheduling: This is an unconstrained minimum-latency scheduling problem. Resources are assumed to be unlimited, and all that is sought is the minimum time needed to perform the computation. Hence it is referred to as ASAP (as soon as possible) scheduling. The code for ASAP scheduling is given below.
ASAP( GS(V,E) ) {
  Schedule v0 by setting t0 = 1;
  repeat {
    Select a vertex vi whose predecessors are all scheduled;
    Schedule vi by setting ti = max (tj + dj), taken over all predecessors vj of vi;
  } until (vn is scheduled);
  return t;
}
Explanation: The various steps are explained below.
Step 1: Schedule v0 at time 1.
Step 2: Select a vertex vi whose predecessors are all scheduled.
Step 3: Schedule vi at time = predecessor's scheduled time + time required for the predecessor to execute. If vi has several predecessors, take the maximum.
Step 4: Has vn been scheduled yet? If not, repeat the loop.
ALAP Scheduling: This is a latency-constrained scheduling problem. Here the schedule must satisfy an upper bound on latency. The goal of ALAP (as late as possible) scheduling is to schedule the tasks/operations as late as possible within that bound. The code for the ALAP scheduling is given below.
ALAP( GS(V,E), λ ) {                 ; ALAP scheduling on the sequencing graph, λ is the upper time bound
  Schedule vn by setting tn = λ + 1;   ; schedule the sink node vn at the upper latency bound + 1
  repeat {
    Select a vertex vi whose successors are all scheduled;
    Schedule vi by setting ti = min (tj) - di, taken over all successors vj of vi;
                                       ; latest start = earliest successor start time - time required for vi itself to execute
  } until (v0 is scheduled);          ; keep going until the source node v0 has been scheduled
  return t;
}
Mobility:
In ASAP scheduling we compute the earliest time at which an operation can be scheduled. One can also do the opposite and compute the latest time at which an operation can be scheduled. ALAP scheduling has the same disadvantages as ASAP scheduling, but the combination of the two gives more powerful heuristics. Mobility, or slack, is an important quantity used as a basis for several scheduling heuristics. It is given by
Mobility = start time in ALAP schedule - start time in ASAP schedule
When Mobility = 0, the task/operation can only be started at the given time if the overall latency constraint is to be met. Mobility > 0 indicates the span of possible start times, which helps in minimizing resources (adders, multipliers, etc.). Hence the concept of mobility can be used to reduce resource requirements.
Physical Design Automation of FPGAs
INTRODUCTION:
The FPGA is a new approach to ASIC design that can dramatically reduce manufacturing turnaround time and cost. In its simplest form, an FPGA consists of a regular array of programmable logic blocks interconnected by a programmable routing network. A programmable logic block is a RAM and can be programmed by the user to act as a small logic module. Given a circuit, the user can program the programmable logic modules using an FPGA programming tool. The key advantage of FPGAs is re-programmability. The RAM nature of FPGAs allows for in-circuit flexibility, which is most useful when the specifications are likely to change in the final application. In some applications, such as remote sensors, it is necessary to make system updates via software. In an FPGA, a data channel is provided which allows easy transfer of the new logic function and reprogramming of the FPGA.
The physical design automation of FPGAs involves three main steps: partitioning, placement and routing. The partitioning problem in FPGAs is significantly different from the partitioning problems in other design styles, and it depends mainly on the architecture in which the circuit has to be implemented. The placement problem in FPGAs is very similar to the gate array placement problem. The routing problem in FPGAs is to find a connection path and program the appropriate interconnection points.
FPGA Technologies:
An FPGA architecture consists mainly of two parts: the logic blocks and the routing network. A logic block has a fixed number of inputs and one output. A wide range of functions can be implemented using a logic block. A circuit to be implemented using FPGAs is first decomposed into smaller sub-circuits such that each sub-circuit can be implemented using a single logic block. There are two types of logic blocks: the first type is based on Look-Up Tables (LUTs), while the second type is based on multiplexers.
Look-up table based logic blocks: A LUT based logic block is just a segment of RAM. A function can be implemented by simply loading its LUT into the logic block at power up. If a function needs to be implemented, then its truth table is loaded into the logic block.
In this way, on receiving a certain set of inputs, the logic block simply ‘looks up’ the appropriate output and sets the output line accordingly. Because of their reconfigurable nature, LUT-based logic blocks are also called Configurable Logic Blocks (CLBs). Since the truth table has one entry per input combination, 2^K bits are required in a logic block to represent a K-bit input, 1-bit output combinational logic function.
Multiplexer based logic blocks: Typically, a multiplexer-based logic block consists of three 2-to-1 multiplexers and one two-input OR gate, as shown in the figure below.
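The two kinds of logic block described above can be modelled in a few lines of Python; this is only a sketch under stated assumptions. The function names are hypothetical. The wiring of the multiplexer block (two first-level multiplexers feeding a third one whose select line is driven by the OR gate) is an assumption, since the figure is not reproduced in this transcript.

```python
from itertools import product


def make_lut(func, k):
    """Build the 2**k-entry truth table of a k-input, 1-output Boolean function."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]


def lut_lookup(table, *inputs):
    """Use the input bits as a RAM address (first input is the most significant bit)."""
    address = 0
    for bit in inputs:
        address = (address << 1) | bit
    return table[address]


def mux2(d0, d1, sel):
    """2-to-1 multiplexer: pass d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0


def mux_logic_block(a0, a1, sa, b0, b1, sb, s0, s1):
    """Assumed wiring: three 2-to-1 multiplexers and one two-input OR gate."""
    first = mux2(a0, a1, sa)
    second = mux2(b0, b1, sb)
    return mux2(first, second, s0 | s1)  # OR gate drives the final select


# LUT example: a 3-input majority function needs 2**3 = 8 stored bits.
majority_table = make_lut(lambda a, b, c: (a & b) | (b & c) | (a & c), 3)


def and_on_mux_block(x, y):
    """Program the multiplexer block as x AND y purely by choosing what feeds its inputs."""
    return mux_logic_block(0, 0, 0, 0, 1, y, x, 0)
```

The LUT is programmed by changing its stored bits, whereas the multiplexer block is programmed by choosing which signals or constants are connected to its pins, as `and_on_mux_block` illustrates.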
  34. The number of inputs is eight. The circuit within the logic block can be used to implement a wide range of functions. One such function, shown in Figure (a), can be mapped to a logic block as shown in Figure (b). Thus, a multiplexer-based logic block is programmed by routing different inputs into the block. There are two models of routing network: segmented and non-segmented.
Physical Design Cycle for FPGAs: The physical design cycle for FPGAs consists of the following three important steps.
Partitioning: The circuit to be mapped onto the FPGA has to be partitioned into smaller sub-circuits, such that each sub-circuit can be mapped to a programmable logic block. Unlike partitioning in other design styles, there are no constraints on the size of a partition. However, there are constraints on the inputs and outputs of a partition. This is due to the unique architecture of FPGAs.
Placement: In this step of the design cycle, the sub-circuits formed in the partitioning phase are allocated physical locations on the FPGA, i.e., each logic block on the FPGA is programmed to behave like the sub-circuit that is mapped to it. This placement must be carried out in such a way that the router can complete the interconnections. This is critical because the routing resources of the FPGA are limited.
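The input/output constraint on a partition mentioned under Partitioning can be checked mechanically. The following is a minimal sketch, assuming a netlist stored as a dict that maps each gate name to the list of signals it reads, with each gate driving one signal named after itself. The netlist, the limit of four inputs and the function names are all hypothetical.

```python
def partition_io(netlist, primary_outputs, part):
    """Return the external input signals and the externally visible gates of a partition."""
    part = set(part)
    # Signals read inside the partition but not produced inside it.
    inputs = {s for g in part for s in netlist[g] if s not in part}
    # Gates whose output is a primary output or is read by a gate outside the partition.
    outputs = {
        g for g in part
        if g in primary_outputs
        or any(g in netlist[h] for h in netlist if h not in part)
    }
    return inputs, outputs


def fits_logic_block(netlist, primary_outputs, part, max_inputs, max_outputs=1):
    """True if the partition respects the pin limits of one logic block (its size is not limited)."""
    inputs, outputs = partition_io(netlist, primary_outputs, part)
    return len(inputs) <= max_inputs and len(outputs) <= max_outputs


# Hypothetical netlist: g3 combines g1 and g2, g4 combines g3 with input e.
netlist = {
    "g1": ["a", "b"],
    "g2": ["c", "d"],
    "g3": ["g1", "g2"],
    "g4": ["g3", "e"],
}
primary_outputs = {"g4"}

ok = fits_logic_block(netlist, primary_outputs, {"g1", "g2", "g3"}, max_inputs=4)
too_many_outputs = fits_logic_block(netlist, primary_outputs, {"g1", "g2"}, max_inputs=4)
too_many_inputs = fits_logic_block(netlist, primary_outputs, set(netlist), max_inputs=4)
```

Here the three-gate partition fits a four-input, one-output block, while the other two candidates violate the output limit and the input limit respectively, even though one of them is smaller.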
  35. Routing: In this phase, all the sub-circuits that have been programmed onto the FPGA logic blocks are interconnected by blowing the fuses between the routing segments.
The figure above shows the complete physical design cycle of FPGAs. The system design is available as a directed graph, which is partitioned in the second step. Placement maps the sub-circuits onto CLBs; the shaded rectangles represent CLBs that have been programmed. The final step is routing of the channels.
References:
1. S. H. Gerez, Algorithms for VLSI Design Automation.
2. Susan Lysecky, ECE 474a/575a.
3. Naveed A. Sherwani, Algorithms for VLSI Physical Design Automation.
