Design of chip controller


A chip controller design: control unit with ALU unit and memory.
Published in: Engineering



  • 2. INTRODUCTION Digital design is a broad and fascinating field. Its applications are present in our daily life, including computers, calculators, and video cameras. There will always be a need for high-speed, low-power digital products, which makes digital design a growing business. The ALU (arithmetic logic unit) is a critical component of a microprocessor and central processing unit; it is the heart of the instruction-execution portion of every computer. An ALU comprises combinational logic that implements logical operations such as AND and OR, and arithmetic operations such as ADD and SUB. An ALU can be built to various specifications. A simple ALU has two inputs for operands, one control input that selects the operation, and one output for the result. The goal of this project is to design a chip controller consisting of a control unit and a 16-bit ALU with memory, which executes various arithmetic and logical operations. The hardware uses an accumulator or registers to store each result. Once the input operands have been read and the appropriate control signal has been applied, the control unit performs the computation and outputs the result. The control unit provides the necessary timing and control signals for all operations in the ALU.
  • 4. BLOCK DIAGRAM AND ITS FUNCTIONALITY 2.1 BLOCK DIAGRAM The main blocks of the processor are the control unit, the ALU, and the memory. CONTROL UNIT This is the main block of the processor, which controls the ALU. Two 16-bit inputs, one 6-bit selection line, and a clock input are given to the control unit, so that on every positive edge of the clock the control unit takes the inputs and drives the output. The control unit then tells the ALU which operation to perform on the data with the help of the selection lines. It also tells the ALU whether or not to access memory. ALU The ALU in our design loads data from two 16-bit data lines and performs operations on that data according to the instructions given by the control unit. [Block diagram: CONTROL UNIT, 16-BIT ALU, MEMORY 16 × 64K bits; signals INPUT1[15:0], INPUT2[15:0], SELECT[5:0], CLOCK, OUTPUT[15:0], CARRY, RW, OUT[15:0], OPC[3:0], ADDR[15:0], DATAIN[15:0], DATAOUT[15:0], EN]
  • 5. These instructions are given to the ALU using an opcode. The ALU in our design can perform arithmetic operations such as addition, subtraction, multiplication, and comparison, and logic operations such as AND, OR, XOR, XNOR, NOR, BUFFER, and NOT. The ALU used in our design is a static ALU. MEMORY The memory used in our design follows the Harvard style, since it is used as data memory only. The memory takes its address from a 16-bit address line and uses two control lines for reading data from and writing data to memory. The size of the memory is 16 × 64K bits (65,536 words of 16 bits).
  • 6. 2.2 OPCODES

No.  Opcode  S[5:0]  Function        Operation
1    32      100000  Addition        A + B
2    33      100001  Subtraction     A - B
3    34      100010  Multiplication  A * B
4    35      100011  Or              A | B
5    36      100100  And             A & B
6    37      100101  Xor             A ^ B
7    38      100110  Not             ~A
8    39      100111  Xnor            A ~^ B
9    41      101001  Comparison      A>B, A<B, A=B
10   46      101110  Buffer          A
11   47      101111  Buffer          B
12   48      110000  Addition        A + MEM(B)
13   49      110001  Subtraction     A - MEM(B)
14   50      110010  Multiplication  A * MEM(B)
15   51      110011  Or              A | MEM(B)
16   52      110100  And             A & MEM(B)
17   53      110101  Xor             A ^ MEM(B)
18   54      110110  Not             ~MEM(A)
19   55      110111  Xnor            A ~^ MEM(B)
20   57      111001  Comparison      A > MEM(B), A < MEM(B), A = MEM(B)
21   62      111110  Buffer          Move from memory
22   15      001111  Buffer          Move to memory

In the selection lines each bit has its own significance. The first four bits select the particular operation to perform on the input data. The fifth bit tells the ALU whether an operand must be fetched from memory.
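The field layout suggested by the table can be sketched in Python. This is a behavioral illustration only, not the actual control-unit logic; the function name is ours, and the role of S[4] as a memory-operand flag is inferred from the opcode pattern above.

```python
# Hypothetical helper splitting the 6-bit selection line into the
# fields suggested by the opcode table: S[3:0] picks the operation
# and, for the arithmetic/logic opcodes, S[4] = 1 means the second
# operand is fetched from memory.
def split_select(select):
    operation = select & 0b001111     # S[3:0]: which operation
    s4 = (select >> 4) & 1            # S[4]: memory-operand flag
    s5 = (select >> 5) & 1            # S[5]: top selection bit
    return s5, s4, operation

print(split_select(48))  # opcode 48: (1, 1, 0), addition with MEM(B)
```

For example, opcodes 32 and 48 decode to the same operation field (0000, addition) and differ only in S[4], matching rows 1 and 12 of the table.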
  • 8. DESIGN METHODOLOGY 3.1 DESIGN METHODOLOGY There were several ways to approach creating the ALU. Our group wanted to make use of the CAS (Complementary Addition and Subtraction) unit so that 2's complement arithmetic could be performed in one operation rather than several passes through the ALU. This led to a design in which multiple functions are executed simultaneously and the desired output is chosen using a multiplexer network. 3.2 ALU DESIGN The ALU designed in this project performs ten different operations on two 16-bit inputs, with and without using memory. The design utilizes the carry-look-ahead method for carry generation in order to speed up the performance of the ALU. 3.3 ADDER/SUBTRACTOR UNIT 3.3.1 CARRY LOOK AHEAD GENERATOR A parallel adder of the ripple-carry type connects the carry output of each full-adder stage to the carry input of the next higher-order stage. Therefore, the sum and carry outputs of any stage cannot be produced until the input carry arrives; this leads to a time delay in the addition process known as carry propagation delay, which is best explained by considering the following addition: 0101 + 0011 = 1000. Addition of the LSB position produces a carry into the second position. This carry, when added to the bits of the second position (stage), produces a carry into the third position. The key thing to notice in this example is that the sum bit generated in the last position (MSB) depends on the carries generated by the additions in the previous positions. This means that the adder will not produce a correct result until the LSB carry has propagated through the intermediate full-adders. This represents a time delay that depends on the propagation delay of each full-adder.
For example, if each full-adder is considered to have a propagation delay of 30 ns, then S3 will not reach its correct value until 90 ns after the LSB carry is generated. Therefore, the total time required to perform the addition is 90 + 30 = 120 ns.
  • 9. Obviously, this situation becomes much worse if we extend the adder circuit to a greater number of bits. If the adder were handling 16-bit numbers, the carry propagation delay could be 480 ns. One method of speeding up this process by eliminating the inter-stage carry delay is called look-ahead-carry addition. This method uses logic gates to look at the lower-order bits of the augend and addend to see whether a higher-order carry is to be generated. Consider the circuit of the full-adder shown in fig 1. Here we define two functions, carry propagate and carry generate:

Pi = Ai ⊕ Bi
Gi = AiBi

The sum output and carry output can then be expressed as

Si = Pi ⊕ Ci
Ci+1 = Gi + PiCi

Gi is called the carry generate because it produces a carry when both Ai and Bi are one, regardless of the input carry. Pi is called the carry propagate because it is the term associated with the propagation of the carry from Ci to Ci+1. The Boolean function for the carry output of each stage can now be written as follows:

C2 = G1 + P1C1
C3 = G2 + P2C2 = G2 + P2G1 + P2P1C1
C4 = G3 + P3C3 = G3 + P3G2 + P3P2G1 + P3P2P1C1

From these Boolean functions it can be seen that C4 does not have to wait for C3 and C2 to propagate; in fact C4 is produced at the same time as C2 and C3. The Boolean functions for the output carries are expressed in sum-of-products form, so they can be implemented with AND-OR logic or NAND-NAND logic. Fig 2 shows the implementation of the Boolean functions for C2, C3, and C4 using AND-OR logic. 3.3.6 LOGIC DIAGRAM OF A CARRY LOOK AHEAD GENERATOR
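As a numeric check, the look-ahead carry functions can be evaluated directly. This is a behavioral sketch, not hardware; the bit lists are LSB-first, so list index 0 corresponds to stage 1 in the equations.

```python
# Behavioral check of the look-ahead carry equations:
#   C2 = G1 + P1C1,  C3 = G2 + P2G1 + P2P1C1,
#   C4 = G3 + P3G2 + P3P2G1 + P3P2P1C1
def lookahead_carries(a_bits, b_bits, c1):
    g = [a & b for a, b in zip(a_bits, b_bits)]   # Gi = Ai.Bi
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # Pi = Ai XOR Bi
    c2 = g[0] | (p[0] & c1)
    c3 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c1)
    c4 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c1))
    return c2, c3, c4

# 0101 + 0011 from the addition example above: every stage carries.
print(lookahead_carries([1, 0, 1, 0], [1, 1, 0, 0], 0))  # (1, 1, 1)
```

All three carries come out directly from the G and P terms, without waiting for a ripple through lower stages.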
  • 10. Using a look-ahead carry generator we can easily construct a 4-bit parallel adder with a look-ahead carry scheme, as shown in fig 3. Each sum output requires two exclusive-OR gates. The first exclusive-OR gate generates Pi, and an AND gate generates Gi. The carries produced by the look-ahead carry generator are applied as inputs to the second exclusive-OR gate, whose other input is Pi; together they generate the sum output. Each output is generated after a delay of two gate levels, so outputs S2 through S4 have equal propagation delay times. [Fig 3: 4-bit adder with look-ahead carry generator producing C2, C3, C4 from G1-G3, P1-P3, and C1]
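The whole 4-bit adder can be sketched behaviorally as follows. The function name is ours; the carry loop simply expands the same recurrence Ci+1 = Gi + PiCi for all stages, and each sum bit is Pi XOR Ci, as in fig 3.

```python
# Behavioral sketch of the 4-bit carry-look-ahead adder of fig 3.
def cla_add4(a, b, cin=0):
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # carry generate
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # carry propagate
    c = [cin]
    for i in range(4):                  # expand Ci+1 = Gi + Pi.Ci
        c.append(g[i] | (p[i] & c[i]))
    s = [p[i] ^ c[i] for i in range(4)]           # Si = Pi XOR Ci
    return sum(bit << i for i, bit in enumerate(s)), c[4]

print(cla_add4(5, 3))   # (8, 0): 0101 + 0011 = 1000, no carry out
```

In hardware the four carries are produced in parallel by the look-ahead network; the Python loop is only a convenient way to write out the same expansion.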
  • 12. 3.4 BINARY SUBTRACTOR The subtraction of unsigned binary numbers can be done most conveniently by means of complements. Recall that the subtraction A - B can be done by taking the 2's complement of B and adding it to A. The 2's complement can be obtained by taking the 1's complement and adding one to the least significant bit. The 1's complement can be implemented with inverters, and the one can be added to the sum through the input carry. The circuit for subtracting A - B consists of an adder with inverters placed between each data input B and the corresponding input of the full-adder. The input carry C0 must be equal to 1 when performing subtraction. The operation thus performed is A plus the 1's complement of B plus 1, which equals A plus the 2's complement of B. For unsigned numbers this gives A - B if A >= B, or the 2's complement of (B - A) if A < B. For signed numbers the result is A - B, provided that there is no overflow. The addition and subtraction operations can be combined into one circuit with one common binary adder by including an exclusive-OR gate with each full-adder; each exclusive-OR gate receives the mode input M and one of the bits of B. When M = 0 we have B ⊕ 0 = B: the full-adders receive the value of B, the input carry is 0, and the circuit performs A plus B. When M = 1 we have B ⊕ 1 = B' and C0 = 1: the B inputs are all complemented and a 1 is added through the input carry, so the circuit performs the operation A plus the 2's complement of B. (The exclusive-OR gate with output V is for detecting an overflow.) It is worth noting that binary numbers in the signed-complement system are added and subtracted by the same basic rules as unsigned numbers. Therefore, computers need only one common hardware circuit to handle both types of arithmetic. The user or programmer must interpret the results of such addition or subtraction differently, depending on whether the numbers are assumed to be signed or unsigned.
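The adder/subtractor just described can be modeled in a few lines. This is a behavioral sketch with names of our choosing: the mode bit M is XORed with every bit of B and also fed in as the input carry, so M = 0 computes A + B and M = 1 computes A plus the 2's complement of B.

```python
# Behavioral sketch of the combined adder/subtractor with mode bit M.
def add_sub(a, b, m, width=4):
    mask = (1 << width) - 1
    b_in = b ^ (mask if m else 0)   # XOR gates: invert B when M = 1
    total = a + b_in + m            # M also supplies the input carry
    return total & mask, (total >> width) & 1   # (result, carry out)

print(add_sub(7, 5, 1))  # (2, 1): 7 - 5 = 2, carry out 1 since A >= B
```

Note that for A < B the result is the 2's complement of (B - A) with carry out 0, exactly as the text states: add_sub(3, 5, 1) gives (14, 0), and 14 is the 4-bit 2's complement of -2.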
3.5 MULTIPLICATION Multiplication and division follow the same mathematical rules used in decimal numbering. However, their implementation is substantially more complex as compared to
  • 13. addition and subtraction. Multiplication can be performed inside a computer in the same way that a person does it on paper. Consider 12 × 12 = 144:

      1 2
    × 1 2
    -----
      2 4      partial product × 10^0
  + 1 2        partial product × 10^1
  -------
    1 4 4      final product

The multiplication process grows in steps as the number of digits in each multiplicand increases, because the number of partial products increases. Binary numbers work the same way, but there can easily be many more partial products, because numbers require more digits to represent them in binary than in decimal. Here is the same multiplication expressed in binary (1100 × 1100 = 10010000):

          1 1 0 0
        × 1 1 0 0
        ---------
          0 0 0 0      partial product × 2^0
        0 0 0 0        partial product × 2^1
      1 1 0 0          partial product × 2^2
  + 1 1 0 0            partial product × 2^3
  ---------------
  1 0 0 1 0 0 0 0      final product

Walking through these partial products takes extra logic and time, which is why multiplication and, by extension, division are considered advanced operations that are not nearly as common as addition and subtraction. Methods of implementing these functions require trade-offs between logic complexity and the time required to calculate a final result. To see how a binary multiplier can be implemented with a combinational circuit, consider the multiplication of two 2-bit numbers as shown in the figure. The multiplicand bits are B1 and B0, the multiplier bits are A1 and A0, and the product is C3 C2 C1 C0. The first partial product is formed by multiplying A0 by B1B0; each partial-product bit can be implemented with an AND gate, as shown in the diagram. The second partial product is formed by multiplying A1 by B1B0 and shifting one position to the left. The two partial products are added with two
  • 14. half adder (HA) circuits. Usually there are more bits in the partial products, and it is then necessary to use full adders to produce the sum of the partial products. Note that the least significant bit of the product does not have to go through an adder, since it is formed directly by the output of the first AND gate.

             B1     B0
     ×       A1     A0
     -----------------
           A0B1   A0B0
    A1B1   A1B0
    ------------------
    C3     C2     C1     C0

A combinational binary multiplier with more bits can be constructed in a similar fashion. Each bit of the multiplier is ANDed with each bit of the multiplicand, in as many levels as there are bits in the multiplier. The binary outputs of each level of AND gates are added to the partial product of the previous level to form a new partial product; the last level produces the final product. For J multiplier bits and K multiplicand bits we need J × K AND gates and (J - 1) K-bit adders to produce a product of J + K bits. [Fig: 2-bit by 2-bit multiplier built from four AND gates and two half adders] 3.6 COMPARATOR The comparison of two numbers is an operation that determines whether one number is greater than, less than, or equal to the other. A magnitude comparator is a combinational circuit that compares two numbers, A and B, and determines their relative
  • 15. magnitudes. The outcome of the comparison is specified by three binary variables that indicate whether A > B, A = B, or A < B. The truth table for a circuit that compares two n-bit numbers has 2^(2n) entries and becomes too cumbersome even with n = 3. On the other hand, as one may suspect, a comparator circuit possesses a certain amount of regularity. Digital functions that possess an inherent, well-defined regularity can usually be designed by means of an algorithmic procedure. An algorithm is a procedure that specifies a finite set of steps that, if followed, give the solution to the problem. We illustrate this method here by deriving an algorithm for the design of a 4-bit magnitude comparator. The algorithm is a direct application of the procedure a person uses to compare the relative magnitudes of two numbers. Consider two numbers, A and B, with four digits each, written with their coefficients in descending order of significance: A = A3A2A1A0 and B = B3B2B1B0. Each subscripted letter represents one of the digits in the number. The two numbers are equal if all pairs of significant digits are equal: A3 = B3 and A2 = B2 and A1 = B1 and A0 = B0. When the numbers are binary, the digits are either 1 or 0, and the equality relation of each pair of bits can be expressed logically with an exclusive-NOR function as xi = AiBi + Ai'Bi' for i = 0, 1, 2, 3, where xi = 1 only if the pair of bits in position i are equal (i.e., if both are 1 or both are 0). The equality of the two numbers A and B is displayed in a combinational circuit by an output binary variable designated by the symbol (A=B). This binary variable is equal to 1 if the input numbers are equal, and 0 otherwise. For the equality condition to exist, all xi variables must be equal to 1, which dictates an AND operation of all variables: (A=B) = x3x2x1x0. The binary variable (A=B) is equal to 1 only if all pairs of digits of the two numbers are equal.
To determine whether A is greater than or less than B, we inspect the relative magnitudes of pairs of significant digits, starting from the most significant position. If the two digits are equal, we compare the next lower significant pair. This comparison continues until a pair of unequal digits is reached. If the corresponding digit of A is 1 and that of B is 0, we conclude that A > B; if the corresponding digit of A is 0 and that of B is 1, we conclude that A < B. The sequential comparison can be expressed logically by the two Boolean functions
  • 16. (A>B) = A3B'3 + x3A2B'2 + x3x2A1B'1 + x3x2x1A0B'0 and (A<B) = A'3B3 + x3A'2B2 + x3x2A'1B1 + x3x2x1A'0B0. The symbols (A>B) and (A<B) are binary output variables that are equal to 1 when A > B or A < B, respectively. 3.6.1 4-BIT MAGNITUDE COMPARATOR The gate implementation of the three output variables just derived is simpler than it seems, because it involves a certain amount of repetition: the unequal outputs can use the same gates that are needed to generate the equal output. The logic diagram of the 4-bit magnitude comparator is shown in fig 1. The four x outputs are generated with exclusive-NOR circuits and applied to an AND gate to give the output binary variable (A=B). The other two outputs use the x variables to generate the Boolean functions listed previously. This is a multilevel implementation with a regular pattern. The procedure for obtaining magnitude comparator circuits for binary numbers with more than four bits is obvious from this example. [Fig 1: 4-bit magnitude comparator with inputs A3-A0 and B3-B0, internal signals x3-x0, and outputs (A<B), (A>B), (A=B)]
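The two Boolean functions can be evaluated directly as a check. This is a behavioral sketch with a name of our choosing; `1 - x` stands in for complementation of a 0/1 value, since Python's `~` operates on full integers.

```python
# Behavioral check of the 4-bit magnitude comparator equations.
def compare4(a, b):
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    x = [1 - (A[i] ^ B[i]) for i in range(4)]   # xi: XNOR of each pair
    eq = x[3] & x[2] & x[1] & x[0]              # (A=B) = x3x2x1x0
    gt = (A[3] & (1 - B[3])) | (x[3] & A[2] & (1 - B[2])) \
       | (x[3] & x[2] & A[1] & (1 - B[1])) \
       | (x[3] & x[2] & x[1] & A[0] & (1 - B[0]))
    lt = ((1 - A[3]) & B[3]) | (x[3] & (1 - A[2]) & B[2]) \
       | (x[3] & x[2] & (1 - A[1]) & B[1]) \
       | (x[3] & x[2] & x[1] & (1 - A[0]) & B[0])
    return gt, eq, lt

print(compare4(9, 5))   # (1, 0, 0): 1001 > 0101
```

Each product term only fires when all higher-order bit pairs are equal (the xi factors) and the current pair first differs in the right direction, exactly mirroring the digit-by-digit procedure described above.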
  • 17. 3.7 LOGIC GATES Logic gates are the building blocks of digital electronics. The fundamental logic gates include the INVERT (NOT), AND, NAND, OR, exclusive-OR (XOR), and exclusive-NOR (XNOR) gates. Each of these gates performs a different logical operation. A description of what each logic gate does, and a switch-and-transistor analogy for each gate, is discussed here. 3.7.1 INVERTER (NOT) TRUTH TABLE:

A | NOT A
0 |   1
1 |   0

ELECTRONIC IMPLEMENTATIONS: NMOS inverter, PMOS inverter, static CMOS inverter, saturated-load digital inverter. DESCRIPTION: Y = ~A. A NOT gate, or inverter, produces an output logic level opposite to that of the input logic level.
  • 18. 3.7.2 AND TRUTH TABLE:

A B | A AND B
0 0 |    0
0 1 |    0
1 0 |    0
1 1 |    1

DESCRIPTION: Y = A & B. The output of the AND gate is high only when both inputs are high.
  • 19. 3.7.3 OR TRUTH TABLE:

A B | A OR B
0 0 |   0
0 1 |   1
1 0 |   1
1 1 |   1

ELECTRONIC IMPLEMENTATION: CMOS OR gate. DESCRIPTION: Y = A | B. The output of the OR gate is high when one or both of the inputs are high.
  • 20. 3.7.4 XOR TRUTH TABLE:

A B | A XOR B
0 0 |    0
0 1 |    1
1 0 |    1
1 1 |    0

DESCRIPTION: OUT = A ^ B. The output of the XOR gate goes high only when the two inputs differ.
  • 21. 3.7.5 XNOR TRUTH TABLE:

A B | A XNOR B
0 0 |    1
0 1 |    0
1 0 |    0
1 1 |    1

DESCRIPTION: Y = A ~^ B. The output of the XNOR gate goes high only when both inputs are the same.
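The two-input truth tables above can be generated with Python's bitwise operators on 0/1 values, which mirror the gate behavior directly; note that XOR is 1 only when the inputs differ, and XNOR only when they match.

```python
# Generate the AND, OR, XOR, and XNOR truth tables for 0/1 inputs.
rows = [(a, b, a & b, a | b, a ^ b, 1 - (a ^ b))
        for a in (0, 1) for b in (0, 1)]
print("A B  AND OR XOR XNOR")
for a, b, and_, or_, xor_, xnor_ in rows:
    print(a, b, "", and_, " ", or_, " ", xor_, "  ", xnor_)
```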
  • 24. MEMORY 4.1 MEMORY Since the dawn of the electronic era, memory or storage devices have been an integral part of electronic systems. As the electronics industry matured and moved away from vacuum tubes to semiconductor devices, research in the area of semiconductor memories also intensified. Semiconductor memory uses semiconductor-based integrated circuits to store information, and the semiconductor memory industry evolved and prospered along with the digital computer revolution. Today, semiconductor memory arrays are widely used in many VLSI subsystems, including microprocessors and other digital systems. In these systems they are used to store programs and data, and in almost all cases they have replaced core memory as the active main memory. More than half of the real estate in many state-of-the-art microprocessors is devoted to cache memories, which are essentially semiconductor memory arrays. System designers' (both hardware and software) unmitigated quest for more memory capacity has accelerated the growth of the semiconductor memory industry. One of the factors that determine a digital computer's performance is its ability to store and retrieve massive amounts of data quickly and inexpensively. Since the beginning of the computer age, this fact has led to the search for the ideal memory: low cost, high performance, high density, low power dissipation, random access, nonvolatile, easy to test, highly reliable, and standardized throughout the industry. Unfortunately, a single memory having all these characteristics has not yet been developed, although each of the characteristics is held by one or another of the MOS memories. Today, MOS memories dominate the semiconductor memory market. 4.2 MEMORY CLASSIFICATION Semiconductor memories can be classified in many different ways.
Semiconductor memories are generally classified based on the basic operation mode, the nature of the data storage mechanism, access patterns, and the storage cell operation. Basic operation mode: Some memory circuits allow modification of information; in other words, we can read data from the memory and write new data into it, whereas other types of memory allow only reading of prewritten information. On the basis of this criterion, memories are classified into two major categories: read/write memories (RWMs) and ROMs. RWMs are more popularly referred to as random access memories (RAMs). In the early days, RAMs were referred to by that name to contrast them with non-semiconductor memories, such as magnetic tapes, that allow only sequential access. It should be noted that ROMs also allow random access the way RAMs do; however, they are not generally called RAMs.
  • 25. Storage mode: On the basis of its ability to retain the stored information when the power supply is turned OFF, semiconductor memory can be classified into two types: volatile and nonvolatile. Volatile memory loses all the stored information once the power supply is turned OFF; RAM is an example of volatile memory. Nonvolatile memory, on the other hand, retains the stored information even when the power supply is turned OFF; ROMs and flash memories are examples of nonvolatile memories. Nonvolatile memories can be further divided into two categories: nonvolatile ROMs (e.g., mask-programmed ROM) and nonvolatile read/write memories (e.g., Flash, EPROM, and EEPROM) (see Table: Memory Classification). Access patterns: On the basis of the order in which data can be accessed, memories can be classified into two different categories: RAMs and non-RAMs. Most memories belong to the random access class. In RAMs, information can be stored or retrieved in a random order at a fixed rate, independent of physical location. There are two kinds of RAMs: static random access memories (SRAMs) and dynamic random access memories (DRAMs). In SRAMs, data is stored in a latch, and the cell retains the written data as long as the power supply to the memory is maintained. In DRAMs, the data is stored as electric charge on a capacitor, and the written data needs to be refreshed periodically to compensate for the charge leakage of the capacitor. It should be noted that both SRAM and DRAM are volatile memories, i.e., they lose the written information as soon as the power supply is turned OFF. Examples of non-RAMs are serial access memories (SAMs) and content-addressable memories (CAMs). A SAM can be visualized as the opposite of RAM: it stores data as a series of memory cells that can only be accessed sequentially. If the data is not in the current location, each memory cell is checked until the needed data is found.
SAM works very well for memory buffers, where the data is normally stored in the order in which it will be used.
  • 26. Texture buffer memory on a video card is an example of SAM. In RAM, we give an address to the memory chip and retrieve the information stored at that particular address. A CAM, by contrast, is designed so that when a data word (an assemblage of bits, usually the width of the data bus) is supplied to the chip, the CAM searches its entire memory to see whether that data word is stored anywhere in the chip. If the data word is found, the CAM returns a list of one or more storage addresses where the word was found, and in some architectures it also returns the data word itself. Finally, there needs to be a way to denote how much data can be stored by any particular memory device. This, fortunately, is simple and straightforward: just count up the number of bits (or bytes, 1 byte = 8 bits) of total data storage space. Due to the high capacity of modern data storage devices, metric prefixes are generally affixed to the unit of bytes in order to represent storage space: 1.6 gigabytes is equal to 1.6 billion bytes, or 12.8 billion bits, of data storage capacity. The only caveat here is to be aware of rounded numbers. Because the storage mechanisms of many random-access memory devices are typically arranged so that the number of "cells" in which bits of data can be stored appears in binary progression (powers of 2), a "one kilobyte" memory device most likely contains 1024 (2 to the power of 10) locations for data bytes rather than exactly 1000. A "64 kbyte" memory device actually holds 65,536 bytes of data (2 to the 16th power), and should probably be called a "66 kbyte" device to be more precise. When we round numbers in our base-10 system, we fall out of step with the round equivalents in the base-2 system. One simple memory circuit is called the data latch, or D-latch. This is a device which, when "told" to do so via the clock input, notes the state of its input and holds that state at its output.
The output state remains unchanged, even if the input state changes, until another update request is received. Traditionally, the input of the D-latch is designated by D and the latched output by Q. In so-called edge-triggered devices, the update command is provided by asserting the clock input in the form of a transition (from LO to HI, or from HI to LO); in level-triggered devices, the output follows the input whenever the clock is HI.
  • 27. D-Latch Symbol and Truth Tables Data present on the input D is passed to the outputs Q and Q' when the clock is asserted. The truth table for an edge-triggered D-latch is shown to the right of the schematic symbol. Some D-latches also have Preset and Clear inputs that allow the output to be set HI or LO independent of the clock signal. In normal operation these two inputs are pulled high so as not to interfere with the clocked logic; however, the outputs Q and Q' can be initialized to a known state using the Preset and Clear inputs when the clocked logic is not active.
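The edge-triggered behavior described above can be captured in a minimal behavioral model (the class name is ours, and only the rising-edge variant without Preset/Clear is sketched): Q updates only on a LO-to-HI clock transition and holds its value otherwise.

```python
# Minimal behavioral model of a rising-edge-triggered D-latch.
class DLatch:
    def __init__(self):
        self.q = 0              # latched output, initially LO
        self._prev_clk = 0      # remembered clock level

    def tick(self, d, clk):
        if clk == 1 and self._prev_clk == 0:   # rising edge: capture D
            self.q = d
        self._prev_clk = clk                   # otherwise Q holds
        return self.q
```

Driving it step by step shows the hold behavior: with D changing while the clock stays HI, Q keeps the value captured on the last rising edge.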
  • 29. VERILOG 5.1 VERILOG In the semiconductor and electronic design industry, Verilog is a hardware description language (HDL) used to model electronic systems. Verilog HDL, not to be confused with VHDL, is most commonly used in the design, verification, and implementation of digital logic chips at the register-transfer level (RTL) of abstraction. It is also used in the verification of analog and mixed-signal circuits. 5.2 HISTORY OF VERILOG Beginning: Verilog was invented by Phil Moorby and Prabhu Goel during the winter of 1983/1984 at Automated Integrated Design Systems (renamed Gateway Design Automation in 1985) as a hardware modeling language. Gateway Design Automation was later purchased by Cadence Design Systems in 1990, and Cadence now has full proprietary rights to Gateway's Verilog and the Verilog-XL logic simulator. Verilog-95: With the increasing success of VHDL at the time, Cadence decided to make the language available for open standardization. Cadence transferred Verilog into the public domain under the Open Verilog International (OVI) organization (now known as Accellera). Verilog was later submitted to the IEEE and became IEEE Standard 1364-1995, commonly referred to as Verilog-95. In the same time frame, Cadence initiated the creation of Verilog-A to put standards support behind its analog simulator Spectre. Verilog-A was never intended to be a standalone language; it is a subset of Verilog-AMS, which encompasses Verilog-95. Verilog 2001: Extensions to Verilog-95 were submitted back to the IEEE to cover deficiencies that users had found in the original Verilog standard. These extensions became IEEE Standard 1364-2001, known as Verilog-2001, a significant upgrade from Verilog-95. First, it adds explicit support for (2's complement) signed nets and variables.
Previously, code authors had to perform signed operations using awkward bit-level manipulations (for example, the carry-out bit of a simple 8-bit addition required an explicit description of the Boolean algebra to determine its correct value). The same function under Verilog-2001 can be described more succinctly by one of the built-in operators: +, -, /, *, >>>. A generate/endgenerate construct (similar to VHDL's generate/endgenerate) allows Verilog-2001 to control instance and statement instantiation
  • 30. through normal decision operators (case/if/else). Using generate/endgenerate, Verilog-2001 can instantiate an array of instances, with control over the connectivity of the individual instances. File I/O has been improved by several new system tasks. Finally, a few syntax additions were introduced to improve code readability (e.g., always @*, named parameter override, C-style function/task/module header declaration). Verilog-2001 is the dominant flavor of Verilog supported by the majority of commercial EDA software packages. Verilog 2005: Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005) consists of minor corrections, spec clarifications, and a few new language features (such as the uwire keyword). A separate part of the Verilog standard, Verilog-AMS, attempts to integrate analog and mixed-signal modelling with traditional Verilog. SYSTEMVERILOG SystemVerilog is a superset of Verilog-2005, with many new features and capabilities to aid design verification and design modeling. The advent of high-level verification languages such as OpenVera and Verisity's e language encouraged the development of Superlog by Co-Design Automation Inc., which was later purchased by Synopsys. The foundations of Superlog and Vera were donated to Accellera, which later became the IEEE standard P1800-2005: SystemVerilog. 5.3 ABOUT THE LANGUAGE Hardware description languages, such as Verilog, differ from software programming languages in several fundamental ways. HDLs add the concepts of concurrency (parallel execution of multiple statements in explicitly specified threads), propagation of time, and signal dependency (sensitivity). There are two assignment operators: a blocking assignment (=) and a non-blocking assignment (<=). The non-blocking assignment allows designers to describe a state-machine update without needing to declare and use temporary storage variables.
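One way to picture the non-blocking assignment just described is as a sample-then-update step, sketched here in Python (an illustration of the semantics, not Verilog itself; the function and register names are ours): all right-hand sides are read first, then every register updates at once, so two registers can swap without a temporary.

```python
# Sketch of non-blocking (<=) update semantics at a clock edge.
def clock_edge(state):
    nxt = {}                      # right-hand sides sampled first
    nxt["a"] = state["b"]         # a <= b;
    nxt["b"] = state["a"]         # b <= a;
    state.update(nxt)             # all registers change together
    return state

print(clock_edge({"a": 1, "b": 2}))  # {'a': 2, 'b': 1}
```

With blocking (=) semantics, executing the two assignments in order would instead copy b into a and then copy the new a back into b, losing the original value of a.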
Since these concepts are part of Verilog's language semantics, designers could quickly write descriptions of large circuits in a relatively compact and concise form. At the time of Verilog's introduction (1984), Verilog represented a tremendous productivity
  • 31. improvement for circuit designers who were already using graphical schematic capture and specially written software programs to document and simulate electronic circuits. The designers of Verilog wanted a language with syntax similar to the C programming language, which was already widely used in engineering software development. Verilog is case-sensitive, has a basic preprocessor (though less sophisticated than that of ANSI C/C++), equivalent control flow keywords (if/else, for, while, case, etc.), and compatible operator precedence. Syntactic differences include variable declaration (Verilog requires bit-widths on net/reg types) and demarcation of procedural blocks (begin/end instead of curly braces {}), though there are many other minor differences. A Verilog design consists of a hierarchy of modules. Modules encapsulate design hierarchy and communicate with other modules through a set of declared input, output, and bidirectional ports. Internally, a module can contain any combination of the following: net/variable declarations (wire, reg, integer, etc.), concurrent and sequential statement blocks, and instances of other modules (sub-hierarchies). Sequential statements are placed inside a begin/end block and executed in sequential order within the block, but the blocks themselves are executed concurrently, qualifying Verilog as a dataflow language. Verilog's concept of a 'wire' consists of both signal values (4-state: 1, 0, floating, undefined) and strengths (strong, weak, etc.). This system allows abstract modeling of shared signal lines, where multiple sources drive a common net. When a wire has multiple drivers, the wire's (readable) value is resolved by a function of the source drivers and their strengths. A subset of statements in the Verilog language is synthesizable. Verilog modules that conform to a synthesizable coding style, known as RTL (register-transfer level), can be physically realized by synthesis software.
Synthesis software algorithmically transforms the (abstract) Verilog source into a netlist: a logically equivalent description consisting only of elementary logic primitives (AND, OR, NOT, flip-flops, etc.) that are available in a specific VLSI technology. Further manipulations of the netlist ultimately lead to a circuit-fabrication blueprint (such as a photo-mask set for an ASIC, or a bitstream file for an FPGA).

There are now two industry-standard hardware description languages, VHDL and Verilog. The complexity of ASIC and FPGA designs has meant an increase in the number of specialist design consultants with specific tools and with their own libraries of macro and mega cells written in either VHDL or Verilog. As a result, it is important that designers know both VHDL and Verilog, and that EDA tool vendors provide an environment that allows both languages to be used in unison. For example, a designer might have a model of
  • 32. a PCI bus interface written in VHDL, but wants to use it in a design with macros written in Verilog.

VHDL (Very High Speed Integrated Circuit Hardware Description Language) became IEEE Standard 1076 in 1987. It was updated in 1993 and is known today as IEEE Standard 1076-1993. The Verilog hardware description language has been in use longer than VHDL; it has been used extensively since it was launched by Gateway in 1983. Cadence bought Gateway in 1989 and opened Verilog to the public domain in 1990. It became IEEE Standard 1364 in December 1995.

There are two aspects to modeling hardware that any hardware description language facilitates: true abstract behavior and hardware structure. This means that modeled hardware behavior is not prejudiced by structural or design aspects of hardware intent, and that hardware structure is capable of being modeled irrespective of the design's behavior.

5.4 VHDL/VERILOG COMPARED & CONTRASTED

This section compares and contrasts individual aspects of the two languages; they are listed in alphabetical order.

Capability

Hardware structure can be modeled equally effectively in both VHDL and Verilog. When modeling abstract hardware, the capability of VHDL can sometimes only be matched in Verilog by using the PLI. The choice of which to use is therefore based not solely on technical capability but on:

  • personal preferences
  • EDA tool availability
  • commercial, business and marketing issues

The modeling constructs of VHDL and Verilog cover slightly different spectrums across the levels of behavioral abstraction; see Figure 1 (HDL modeling capability).
  • 33. COMPILATION

VHDL. Multiple design units (entity/architecture pairs) that reside in the same system file may be separately compiled if so desired. However, it is good design practice to keep each design unit in its own system file, in which case separate compilation should not be an issue.

Verilog. The Verilog language is still rooted in its native interpretive mode. Compilation is a means of speeding up simulation, but it has not changed the original nature of the language. As a result, care must be taken with both the compilation order of code written in a single file and the compilation order of multiple files. Simulation results can change simply by changing the order of compilation.

DATA TYPES

Verilog. Compared to VHDL, Verilog data types are very simple, easy to use, and very much geared towards modeling hardware structure as opposed to abstract hardware modeling. Unlike VHDL, all data types used in a Verilog model are defined by the Verilog language and not by the user. There are net data types, for example wire, and a register data type called reg. A model with a signal whose type is one of the net data types has a corresponding electrical wire in the implied modeled circuit. Objects (that is, signals) of type reg hold their value over simulation delta cycles and should not be confused with the modeling of a hardware register. Verilog may be preferred because of its simplicity.

Design reusability

Verilog. There is no concept of packages in Verilog. Functions and procedures used within a model must be defined in the module. To make functions and procedures generally accessible from different module statements, the functions and procedures must be placed in a separate system file and included using the `include compiler directive.

Easiest to Learn

Starting with zero knowledge of either language, Verilog is probably the easier to grasp and understand.
This assumes the Verilog compiler-directive language for simulation and the PLI language are not included. If these languages are included, they can be looked upon as two additional languages that need to be learned. VHDL may seem less intuitive at first for two primary reasons. First, it is very strongly typed, a feature that makes it robust and powerful for the advanced user after a longer learning phase. Second, there are many ways to model the same circuit, especially those with large hierarchical structures.

Forward and back annotation

A spin-off from Verilog is the Standard Delay Format (SDF). This is a general-purpose format used to define the timing delays in a circuit. The format provides a bidirectional link
  • 34. between chip-layout tools and either synthesis or simulation tools, in order to provide more accurate timing representations. The SDF format is now an industry standard in its own right.

High level constructs

Verilog. Except for being able to parameterize models by overloading parameter constants, there is no equivalent in Verilog to the high-level VHDL modeling statements.

LANGUAGE EXTENSIONS

The use of language extensions will make a model non-standard and most likely not portable across other design tools. However, they are sometimes necessary in order to achieve the desired results.

Verilog. The Programming Language Interface (PLI) is an interface mechanism between Verilog models and Verilog software tools. For example, a designer, or more likely a Verilog tool vendor, can specify user-defined tasks or functions in the C programming language, and then call them from the Verilog source description. Use of such tasks or functions makes a Verilog model non-standard, and so it may not be usable by other Verilog tools. Their use is not recommended.

Libraries

Verilog. There is no concept of a library in Verilog. This is due to its origins as an interpretive language.

Low Level Constructs

Verilog. The Verilog language was originally developed with gate-level modeling in mind, and so has very good constructs for modeling at this level and for modeling the cell primitives of ASIC and FPGA libraries. Examples include User Defined Primitives (UDPs), truth tables, and the specify block for specifying timing delays across a module.

Managing large designs

Verilog. There are no statements in Verilog that help manage large designs.

Operators

The majority of operators are the same in the two languages. Verilog does have very useful unary reduction operators that are not in VHDL; a loop statement can be used in VHDL to perform the same operation as a Verilog unary reduction operator. VHDL has the mod operator, which is not found in Verilog.
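The behavior of Verilog's unary reduction operators (and the VHDL-style loop that emulates them) can be sketched in a small Python model. This is illustrative only; the 16-bit width simply matches the project's data path.

```python
from functools import reduce
import operator

def bits(value, width=16):
    """Split an integer into its individual bits, LSB first."""
    return [(value >> i) & 1 for i in range(width)]

# Verilog's unary reduction operators (&x, |x, ^x) fold one operator
# across every bit of a vector -- equivalent to a VHDL for-loop.
def reduce_and(x, width=16):
    return reduce(operator.and_, bits(x, width))

def reduce_or(x, width=16):
    return reduce(operator.or_, bits(x, width))

def reduce_xor(x, width=16):   # parity of the vector
    return reduce(operator.xor, bits(x, width))

print(reduce_and(0xFFFF))  # 1: all bits set
print(reduce_or(0x0000))   # 0: no bit set
print(reduce_xor(0x000B))  # 1: 0b1011 has an odd number of ones
```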
Parameterizable models

Verilog. A specific-width model can be instantiated from a generic n-bit model using overloaded parameter values. The generic model must have a default parameter value defined. This means two things. In the absence of an overloaded value being specified, it will still synthesize, but will use the specified default parameter value. Also, it does not need to be instantiated with an overloaded parameter value specified before it will synthesize.

Procedures and tasks

VHDL allows concurrent procedure calls; Verilog does not allow concurrent task calls.

Readability

This is more a matter of coding style and experience than of language feature. VHDL is a verbose language; its roots are based on Ada. Verilog is more like C, because its constructs are based approximately 50% on C and 50% on Ada. For this reason an existing C programmer may prefer Verilog over VHDL, although an existing programmer of both C and Ada may find the mix of constructs somewhat confusing at first. Whatever HDL is used, when writing or reading an HDL model to be synthesized it is important to think about hardware intent.

Structural replication

Verilog. There is no equivalent to the generate statement in Verilog.

Test harnesses

Designers typically spend about 50% of their time writing synthesizable models and the other 50% writing a test harness to verify them. Test harnesses are not restricted to the synthesizable subset and so are free to use the full potential of the language. VHDL has generic and configuration statements that are useful in test harnesses and that are not found in Verilog.

Verboseness

Verilog. Signals representing objects of different bit widths may be assigned to each other. The signal representing the smaller number of bits is automatically padded out to the width of the larger, independently of whether it is the assigned signal or not. Unused bits will be automatically optimized away during the synthesis process. This has the advantage of not needing to model quite so explicitly as in VHDL, but it does mean that unintended modeling errors will not be identified by an analyzer.
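The automatic width adjustment described under Verboseness can be modeled in a few lines. This is a Python sketch of Verilog's unsigned assignment rules only; it covers zero-extension and truncation, not signed extension.

```python
def assign(value, src_width, dst_width):
    """Model an unsigned Verilog assignment between different bit widths."""
    value &= (1 << src_width) - 1
    if dst_width >= src_width:
        return value                       # zero-extended (padded) implicitly
    return value & ((1 << dst_width) - 1)  # high-order bits silently dropped

print(bin(assign(0b1010, 4, 8)))   # 0b1010 -> padded to 8 bits, value kept
print(hex(assign(0xAB, 8, 4)))     # 0xb    -> truncated to 4 bits
```

The silent truncation in the second case is exactly the kind of unintended modeling error the text warns will not be flagged by an analyzer.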
  • 37. CADENCE

6.1 CADENCE TOOLS

The Cadence suite is a huge collection of programs for different CAD applications, from VLSI design to high-level DSP programming. The suite is divided into different “packages,” and for VLSI design the packages we will be using are the IC package and the DSMSE package. The Cadence toolset is a complete microchip EDA system, intended for developing professional, full-scale, mixed-signal microchips and breadboards. The modules included in the toolset cover schematic entry, design simulation, data analysis, physical layout, and final verification. The strength of the Cadence tools is in analog design/simulation/layout and mixed-signal verification; they are often used in tandem with other tools for RF and/or digital design/simulation/layout, with complete top-level verification done in the Cadence tools. Another important concept is that the Cadence tools only provide a framework for doing design: without a foundry-provided design kit, no design can be done.

Cadence Design Systems, Inc. (NASDAQ: CDNS), the leader in global electronic-design innovation, announced that Global Unichip Corporation (GUC), a leading system-on-chip (SoC) design foundry, was the first Taiwan-based design company to complete a successful tape-out of a 65-nanometer device. The success of this 65-nanometer tape-out further strengthened GUC's advanced technology capabilities to serve top-tier customers worldwide. GUC used the Cadence(R) Low-Power Solution and SoC Encounter(TM) GXL RTL-to-GDSII system to achieve the tape-out.

6.2 ABOUT CADENCE COMPANY

Cadence enables global electronic-design innovation and plays an essential role in the creation of today's integrated circuits and electronics. Customers use Cadence software and hardware, methodologies, and services to design and verify advanced semiconductors, consumer electronics, networking and telecommunications equipment, and computer systems.
Cadence reported 2006 revenues of approximately $1.5 billion and has approximately 5,200 employees. The company is headquartered in San Jose, Calif., with sales offices, design centers, and research facilities around the world to serve the global electronics industry.

Since ours is a digital design, the tools used in our project are:
  • 38.
  • IUS - Incisive Unified Simulator
  • RC - RTL Compiler
  • SOC Encounter - System On Chip Encounter (this tool works on 18-nanometer technology)

Now we will study these tools in detail, along with the results of our project obtained using them.

FLOW OF DESIGN USING CADENCE TOOLS
  • 39. 6.3 INCISIVE UNIFIED SIMULATOR

Incisive Unified Simulator (IUS) is a tool used to simulate digital circuits. Designs are represented using many different languages, such as Verilog or VHDL. IUS supports those languages as well as additional languages used for specialized verification functions, such as SystemC, a derivative of C++. The tool handles any design that can be represented digitally in the key languages. The Verilog-only environment is called NC-Verilog and the VHDL one is called NC-VHDL. Depending on the complexity of their simulation tasks, designers will create environments that use multiple languages to perform advanced verification tasks.

Who needs IUS:
  • System architects, who need to analyze various scenarios to determine the right grouping of components. This is typically done with simple IP models to look at high-level behavior.
  • Design engineers, who are creating the various parts of the circuit, use IUS to test the behavior and make sure the requirements are met.
  • Verification engineers, a specialized team that takes the design once it is completed and creates tests that exercise the complete design, testing actual conditions as closely as possible.
  • IP vendors, who use IUS to create IP models and to ensure that their models behave correctly with the tools that their customers will use.
  • Board designers, who will use IUS as a means

BENEFITS
  • Speeds time-to-market with lower risk and higher predictability.
  • Increases productivity by enabling verification to start months earlier, before test bench development and simulation.
  • Improves quality and reduces the risk of re-spins by exposing corner-case functional bugs that are difficult or impossible to find using conventional methods.
  • Reduces block design effort and debug time, and shortens integration time.
  • Provides design teams with an advanced debug environment with simulation synergies for ease of adoption.
  • Offers the ultimate simulation-based speed and efficiency.
  • Increases RTL performance by 100 times with native transaction-level simulation and optional Acceleration-on-Demand.
  • Reduces test bench development by up to 50% with transaction-level support, unified test generation, and verification-component re-use.
  • Shortens verification time, finds bugs faster, and eliminates exhaustive simulation runs with dynamic assertion checking.
  • Decreases debug time by up to 25% through unified transaction/signal viewing, HDL analysis capability, and a unified debug environment for all languages.
  • 40. This program is a front-end to some of the other tools in this directory. Its job is to compile, elaborate, and launch the simulation.

ncvlog
This is the Verilog compiler. Typing this command with no arguments gives a listing of the possible options. Two useful arguments are -cdslib, which specifies the location of your cds.lib file, and -work, which specifies the location of your worklib file. To compile a Verilog file named test.v and its test bench named tb_test.v, using a cdslib of cds.lib and a work library of worklib, you can execute the following command:

ncvlog -cdslib cds.lib -work worklib test.v tb_test.v

ncelab
This is the elaborator. Again, typing this command with no arguments outputs a list of options, and the -cdslib and -work arguments above apply. To elaborate a compiled test bench called tb_test.v, execute the following command:

ncelab -cdslib cds.lib -work worklib worklib.tb_test_v

ncsim
This is the actual simulator. To launch a compiled and elaborated test bench, execute the following command:

ncsim -gui -cdslib cds.lib -work worklib worklib.tb_test_v:module
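The three-step compile/elaborate/simulate flow above is easy to script. The sketch below simply assembles the same command lines shown in the text; the tool names and flags come from the text, and file names such as cds.lib and worklib are the same assumed examples.

```python
import subprocess

def ncsim_flow(sources, top, cdslib="cds.lib", work="worklib", run=False):
    """Build (and optionally run) the ncvlog -> ncelab -> ncsim commands."""
    common = ["-cdslib", cdslib, "-work", work]
    cmds = [
        ["ncvlog"] + common + list(sources),                    # compile
        ["ncelab"] + common + [f"{work}.{top}"],                # elaborate
        ["ncsim", "-gui"] + common + [f"{work}.{top}:module"],  # simulate
    ]
    if run:  # only meaningful on a machine with the Cadence tools installed
        for cmd in cmds:
            subprocess.run(cmd, check=True)
    return cmds

for cmd in ncsim_flow(["test.v", "tb_test.v"], "tb_test_v"):
    print(" ".join(cmd))
```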
  • 42. MODULE – CONTROL UNIT

`resetall
`timescale 1 ns / 1 ns
`view
module cu (inp1, inp2, s, refclk, result, carry);
  input  [15:0] inp1, inp2;
  input  [3:0]  s;
  input         refclk;
  output [31:0] result;
  output        carry;

  reg  [3:0]  opc;
  reg         rw, um;
  wire [31:0] res;
  wire        car;

  assign result = res;
  assign carry  = car;

  // Instantiate the ALU (the original text named the instance type "alu1",
  // which does not match the module name "alu" defined below).
  alu alu_inst (inp1, inp2, opc, rw, um, res, car);

  // On each rising clock edge, latch the select lines as the opcode and
  // derive the memory read/write control from um. (The original assigned
  // rw = 1 in both branches; driving it from um is the apparent intent.)
  always @ (posedge refclk) begin
    opc = {s[3], s[2], s[1], s[0]};
    if (um == 1)
      rw = 1;
    else
      rw = 0;
  end
endmodule
`noview

MODULE – ALU

`resetall
`timescale 1 ns / 1 ns
`view
module alu (in1, in2, s, rw, um, out, cout);
  input  [15:0] in1, in2;
  input  [3:0]  s;
  input         rw, um;
  output [31:0] out;
  output        cout;

  wire [15:0] out22, out33, out77;
  wire [31:0] out88;
  wire        cout1;
  wire [15:0] datout;
  wire [1:0]  ss;
  reg  [15:0] inn2, int;
  reg  [15:0] out1, out3, out4, out5, out6, out7, out8, out9, out10;
  reg  [31:0] out2;
  reg  [15:0] m1, m2;
  reg         cc, en, e, enn;

  // Operation select (s): 000x add/sub, 0010 multiply, 0011 or, 0100 and,
  // 0101 xor, 0110 not, 0111 out33, 10xx/110x compare, 1110/1111 pass.
  // (The original "( s[0] : out5: out4)" was a malformed ternary.)
  assign out = s[3] ? (s[2] ? (s[1] ? (s[0] ? out10 : out9) : out8) : out8)
                    : (s[2] ? (s[1] ? (s[0] ? out7 : out6)
                                    : (s[0] ? out5 : out4))
                            : (s[1] ? (s[0] ? out3 : out2) : out1));
  assign cout = cout1;

  add16 adder16bit_inst (in1, inn2, cc, enn, out22, cout1);
  // The mul module below has three ports (a, b, mult); the original
  // instantiation passed a fourth, non-existent enable signal.
  mul   multiplication_inst (m1, m2, out88);
  cmpr  comparator_inst (in1, int, e, ss, out77);
  mem   memory_inst (in2, in1, rw, en, datout);

  // Decode the operand sources and drive the sub-blocks.
  always @ (in1 or in2 or s or rw or um or datout) begin
    en = 1;
    // The original "if (um == 1);" ended with a stray semicolon, which made
    // the following assignment unconditional.
    if (um == 1)
      int = datout;            // operand 2 comes from memory
    else
      int = in2;               // operand 2 comes from the input bus
    inn2 <= {16{s[0]}} ^ int;  // invert operand for subtraction (s[0] = 1)
    cc   <= s[0];              // carry-in completes the two's complement
    e    <= 1;
    m1   = in1;
    m2   = int;
    enn  <= 1;
    out6  = ~in1;
    out2  = out88;
    out3  = in1 | int;
    out1  = out22;
    out4  = in1 & int;
    out5  = in1 ^ int;
    out7  = out33;             // out33 is left undriven in this design
    out8  = out77;
    out9  = in1;
    out10 = int;
  end
endmodule
`noview

MODULE – 16 BIT ADDER
  • 44. `resetall
`timescale 1 ns / 1 ns
`view
// Carry-lookahead adder: p/g are the propagate/generate terms and each
// carry bit is computed directly from them.
module add16 (a, b, c_in, Enn, sum, cout);
  input  [15:0] a, b;
  input         c_in;
  input         Enn;
  output [15:0] sum;
  output        cout;

  reg [15:0] p, g;
  reg [16:0] carry;
  reg [15:0] s;
  integer i;

  assign sum  = s;
  assign cout = carry[16];

  // Recompute whenever the operands change (the original was sensitive to
  // Enn only, which the ALU holds constant after the first operation).
  always @ (a or b or c_in or Enn) begin
    p = (a ^ b);
    g = (a & b);
    carry[0] = c_in;
    carry[1] = (g[0] | (p[0] & c_in));
    carry[2] = (g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in));
    carry[3] = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in));
    carry[4] = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c_in));
    carry[5] = (g[4] | (p[4] & g[3]) | (p[4] & p[3] & g[2]) | (p[4] & p[3] & p[2] & g[1]) | (p[4] & p[3] & p[2] & p[1] & g[0]) | (p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
    carry[6] = (g[5] | (p[5] & g[4]) | (p[5] & p[4] & g[3]) | (p[5] & p[4] & p[3] & g[2]) | (p[5] & p[4] & p[3] & p[2] & g[1]) | (p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
    carry[7] = (g[6] | (p[6] & g[5]) | (p[6] & p[5] & g[4]) | (p[6] & p[5] & p[4] & g[3]) | (p[6] & p[5] & p[4] & p[3] & g[2]) | (p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
    carry[8] = (g[7] | (p[7] & g[6]) | (p[7] & p[6] & g[5]) | (p[7] & p[6] & p[5] & g[4]) | (p[7] & p[6] & p[5] & p[4] & g[3]) | (p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
  • 45. carry[9] = (g[8] | (p[8] & g[7]) | ( p[8] & p[7] & g[6]) | (p[8] & p[7] & p[6] & g[5]) | (p[8] & p[7] & p[6] & p[5] & g[4]) | (p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[10] =( g[9] | (p[9] & g[8]) | ( p[9] & p[8] & g[7]) | (p[9] & p[8] & p[7] & g[6]) | (p[9] & p[8] & p[7] & p[6] & g[5]) | (p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[4] & p[3] & p[2] & g[1]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[11] = (g[10] | (p[10] & g[9]) | (p[10] & p[9] & g[8]) | (p[10] & p[9] & p[8] & g[7]) | ( p[10] & p[9] & p[8] & p[7] & g[6]) | (p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[12] = (g[11] | (p[11] & g[10]) | (p[11] & p[10] & g[9]) | (p[11] & p[10] & p[9] & g[8]) | ( p[11] & p[10] & p[9] & p[8] & g[7]) | (p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | ( p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | ( p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[11] & p[10] & p[9] & p[8] & p[7] & 
p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | ( p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[13] = (g[12] | (p[12] & g[11]) | ( p[12] & p[11] & g[10] ) | (p[12] & p[11] & p[10] & g[9] ) | ( p[12] & p[11] & p[10] & p[9] & g[8]) | (p[12] & p[11] & p[10] & p[9] & p[8] & g[7] ) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6] ) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | ( p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p [6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[14] = (g[13] | (p[13] & g[12] ) | (p[13] & p[12] & g[11]) | (p[13] & p[12] & p[11] & g[10]) | (p[13] & p[12] & p[11] & p[10] & g[9]) | (p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
  • 46. carry[15] = (g[14] | (p[14] & g[13]) | (p[14] & p[13] & g[12] ) | (p[14] & p[13] & p[12] & g[11]) | (p[14] & p[13] & p[12] & p[11] & g[10]) | ( p[14] & p[13] & p[12] & p[11] & p[10] & g[9] ) | ( p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | ( p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6] ) | ( p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[ 6] & p[5] & p[4] & g[3]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p [3] & p[2] & p[1] & g[0]) | (p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in)); carry[16] = (g[15] | (p[15] & g[14]) | (p[15] & p[14] & g[13]) | (p[15] & p[14] & p[13] & g[12]) | (p[15] & p[14] & p[13] & p[12] & g[11]) | (p[15] & p[14] & p[13] & p[12] & p[11] & g[10]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & g[9]) | ( p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & g[8]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & g[7]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & g[6]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & g[5]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & g[4]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & g[3]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & g[2]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & 
p[6] & p[5] & p[4] & p[3] & p[2] & g[1]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & g[0]) | (p[15] & p[14] & p[13] & p[12] & p[11] & p[10] & p[9] & p[8] & p[7] & p[6] & p[5] & p[4] & p[3] & p[2] & p[1] & p[0] & c_in));
    // Sum bits follow from the propagate terms and the lookahead carries.
    for (i = 0; i < 16; i = i + 1)
      s[i] = (p[i] ^ carry[i]);
  end
endmodule

MODULE – MULTIPLICATION

`resetall
`timescale 1 ns / 1 ns
`view
// Shift-and-add multiplier: each partial product a[i] & b is added to the
// running (shifted) sum, one row per iteration.
module mul (a, b, mult);
  input  [15:0] a, b;
  output [31:0] mult;

  integer i, j;
  reg [15:0] prod1, prod, sum;
  reg        carry;
  reg [31:0] m;

  assign mult = m;

  always @ (a or b) begin
    carry = 0;
    // First partial product: bit 0 of the result, then shift right.
    for (j = 0; j < 16; j = j + 1)
      prod1[j] = a[0] & b[j];
    m[0] = prod1[0];
    prod1 = prod1 >> 1;
    for (i = 1; i < 16; i = i + 1) begin
      for (j = 0; j < 16; j = j + 1)
        prod[j] = a[i] & b[j];
      carry = 0;  // each row's ripple addition starts with no carry-in
      for (j = 0; j < 16; j = j + 1) begin
        sum[j] = prod[j] ^ prod1[j] ^ carry;
        carry  = (carry & prod[j]) | (carry & prod1[j]) | (prod[j] & prod1[j]);
      end
      m[i] = sum[0];
      prod1 = sum >> 1;
      prod1[15] = carry;  // carry-out becomes the new top bit
    end
    // The remaining high-order bits of the product.
    for (i = 0; i < 16; i = i + 1)
      m[i+16] = prod1[i];
  end
endmodule

MODULE - COMPARATOR

`resetall
`timescale 1 ns/ 1 ns
`view
// Magnitude comparator: x[i] marks per-bit equality, and c/d are the
// complemented operands used to detect the first differing bit.
module cmpr (a, b, en, cmp, grt);
  input  [15:0] a, b;
  input         en;
  output [1:0]  cmp;
  output [15:0] grt;

  reg [15:0] x, ar, br, grtr;
  integer i;
  reg eq, gr, ls;
  reg [15:0] c, d;
  reg [1:0]  compr;

  assign cmp = compr;
  assign grt = grtr;

  // Recompute whenever the operands change (the original was sensitive to
  // en only, which the ALU drives to a constant).
  always @ (a or b or en) begin
    ar = a;
    br = b;
    for (i = 15; i > -1; i = i - 1) begin
      x[i] = (a[i] & b[i]) | ((~a[i]) & (~b[i]));
  • 48. c[i] = ~b[i]; d[i] = ~a[i]; end eq = x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & x[1] & x[0]; gr = (a[15] & c[15]) | (x[15] & a[14] & c[14]) | (x[15] & x[14] & a[13] & c[13]) | (x[15] & x[14] & x[13] & a[12] & c[12]) | ( x[15] & x[14] & x[13] & x[12] & a [11] & c [11]) | (x[15] & x[14] & x[13] & x[12] & x[11] & a[10] & c[10]) | ( x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & a[9] & c[9]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & a[8] & c[8]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & a [7] & c[7]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & a[6] & c[6]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & a[5] & c[5]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & a[4] & c[4]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & a[3] & c[3]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] &x[5] & x[4] & x[3] & a[2] & c[2]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & a[1] & c[1]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] &x[1] & a[0] & c[0]); ls = (d[15] & b[15]) | (x[15] & d[14] & b[14]) | (x[15] & x[14] & d[13] & b[13]) | (x[15] & x[14] & x[13] & d[12] & b[12]) | ( x[15] & x[14] & x[13] & x[12] & d [11] & b [11]) | (x[15] & x[14] & x[13] & x[12] & x[11] & d[10] & b[10]) | ( x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & d[9] & b[9]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & d[8] & b[8]) | (x[15] & x[14] &x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & d [7] & b[7]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7]& d[6] & b[6]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & d[5] & b[5]) | (x[15] & x[14] & 
x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & d[4] & b[4]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & d[3] & b[3]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & d[2] & b[2]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & d[1] & b[1]) | (x[15] & x[14] & x[13] & x[12] & x[11] & x[10] & x[9] & x[8] & x[7] & x[6] & x[5] & x[4] & x[3] & x[2] & x[1] & d[0] & b[0]);
    // Encode the comparison result and forward the larger operand.
    if (eq == 1) begin
      compr = 2'b00;  // a == b
      grtr  = ar;
    end
    else if (gr == 1) begin
      compr = 2'b01;  // a > b
      grtr  = ar;
    end
    else if (ls == 1) begin
      compr = 2'b10;  // a < b
      grtr  = br;
    end
    else
      compr = 2'b11;  // unreachable for valid (known) inputs
  end
endmodule
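As a quick sanity check of the comparator's equations, the same equal/greater/less logic can be modeled behaviorally. This is a Python sketch; the 2-bit codes 00/01/10 mirror the compr encoding in the module above.

```python
def cmpr_model(a, b, width=16):
    """Behavioral model of the cmpr module: returns (cmp_code, grt)."""
    # Per-bit equality, as computed by x[i] in the Verilog.
    x = [((a >> i) & 1) == ((b >> i) & 1) for i in range(width)]
    eq = all(x)
    gr = ls = False
    # Scan from the MSB: the first differing bit decides the comparison.
    for i in reversed(range(width)):
        if not x[i]:
            gr = bool((a >> i) & 1)
            ls = not gr
            break
    if eq:
        return 0b00, a
    if gr:
        return 0b01, a
    return 0b10, b   # ls: b is the larger operand

print(cmpr_model(7, 7))    # (0, 7)
print(cmpr_model(9, 4))    # (1, 9)
print(cmpr_model(3, 12))   # (2, 12)
```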
  • 49. MODULE – MEMORY

`resetall
`timescale 1 ns/ 1 ns
`view
// 65536 x 16-bit memory: rw = 1 reads, rw = 0 writes.
module mem (addr, datain, rw, en, dataout);
  input  [15:0] datain, addr;
  output [15:0] dataout;
  input         rw, en;

  // 64K words of 16 bits each (the original declaration transposed the
  // word width and the depth).
  reg [15:0] mem1 [0:65535];
  reg [15:0] dat;

  assign dataout = dat;

  always @ (addr or datain or rw or en) begin
    if (rw == 1)
      dat = mem1[addr];      // read
    else begin
      mem1[addr] = datain;   // write, and echo the written data
      dat = datain;
    end
  end
endmodule
`noview
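To tie the modules together, the opcode-to-operation mapping implied by the ALU's nested conditional can be captured in a small reference model. This is a Python sketch; the opcode assignments are inferred from the code above and would need to be confirmed against simulation.

```python
MASK16 = 0xFFFF

def alu_model(in1, in2, s):
    """Golden reference for the 16-bit ALU's arithmetic/logic opcodes."""
    if s == 0b0000:                          # add (carry-out in bit 16)
        total = in1 + in2
        return total & MASK16, total >> 16
    if s == 0b0001:                          # subtract via two's complement
        total = in1 + ((~in2) & MASK16) + 1
        return total & MASK16, total >> 16   # carry = 1 means no borrow
    if s == 0b0010:                          # multiply (32-bit product)
        return (in1 * in2) & 0xFFFFFFFF, 0
    if s == 0b0011:
        return in1 | in2, 0
    if s == 0b0100:
        return in1 & in2, 0
    if s == 0b0101:
        return in1 ^ in2, 0
    if s == 0b0110:
        return (~in1) & MASK16, 0
    raise ValueError("opcode not modeled in this sketch")

print(alu_model(0x000A, 0x0003, 0b0000))  # (13, 0)
print(alu_model(0x000A, 0x0003, 0b0001))  # (7, 1)
print(alu_model(0x00FF, 0x0002, 0b0010))  # (510, 0)
```

A model like this is the conventional golden reference for a Verilog testbench: drive the same stimulus into the RTL and the model, and compare the outputs.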