The document describes the design methodology for an ALU chip controller. It discusses using a carry look-ahead adder to speed up addition and subtraction. The ALU can perform various arithmetic (addition, subtraction, multiplication) and logical (AND, OR, XOR) operations. It uses a combinational logic design with multiplexers to select the output. The block diagram shows the main components are a control unit, 16-bit ALU, and memory. The control unit provides signals to control the ALU operations.
Introduction
Digital design is a broad field whose applications appear throughout daily life, including
computers, calculators and video cameras. The continuing need for high-speed, low-power
digital products makes digital design a growing business.
The ALU (arithmetic logic unit) is a critical component of a microprocessor and central
processing unit, and it is the heart of the instruction execution portion of every
computer. An ALU comprises combinational logic that implements logical operations such
as AND and OR, and arithmetic operations such as ADD and SUB. An ALU can be built to
various specifications. A simple ALU has two inputs for the operands, one input for the
control signal that selects the operation, and one output for the result.
The goal of this project is to design a chip controller consisting of a control unit, a
16-bit ALU and memory, which executes various arithmetic and logical operations. The
hardware uses an accumulator or registers to store each result. When an input operand has
been read and the appropriate control signal has been passed to the control unit, the ALU
performs the computation and outputs the result. The control unit provides the necessary
timing and control signals for all the operations in the ALU.
BLOCK DIAGRAM AND ITS FUNCTIONALITY
2.1 BLOCK DIAGRAM
The main blocks of the processor are
Control unit
ALU
Memory
CONTROL UNIT
It is the main block of the processor and controls the ALU. Two 16-bit inputs, one 6-bit
selection line and the clock input are given to the control unit. On every positive edge
of the clock, the control unit takes in the inputs and drives the output. Using the
selection lines, the control unit tells the ALU which operation to perform on the data,
and whether or not to access memory.
ALU
The ALU in our design loads data from two 16-bit data lines and operates on that data
according to the instructions given by the control unit.
[Figure: block diagram of the chip controller, showing the CONTROL UNIT, the 16-BIT ALU and the MEMORY (16 x 65 Kbits). Signal labels: [15:0]INPUT1, [15:0]INPUT2, [5:0]SELECT, CLOCK, [15:0]OUTPUT, CARRY, RW, EN, [3:0]OPC, [15:0]OUT, [15:0]ADDR, [15:0]DATAIN, [15:0]DATAOUT.]
These instructions are given to the ALU as opcodes. The ALU in our design can perform
arithmetic operations (addition, subtraction, multiplication and comparison) and logic
operations (AND, OR, XOR, XNOR, NOR, BUFFER and NOT). The ALU used in our design is a
static ALU.
MEMORY
The memory used in our design is a Harvard-style memory, because it is used as data
memory only. The memory takes its address from a 16-bit address line and uses two
separate lines, one for reading data from memory and one for writing data to it. The size
of the memory is 16 x 64 Kbits (65,536 locations of 16 bits each).
6. 2.2 OPCODES
S.NO  OPCODE  S[5] S[4] S[3] S[2] S[1] S[0]  FUNCTION        MATHEMATICAL REPRESENTATION
1     32      1    0    0    0    0    0     Addition        A + B
2     33      1    0    0    0    0    1     Subtraction     A - B
3     34      1    0    0    0    1    0     Multiplication  A * B
4     35      1    0    0    0    1    1     Or              A | B
5     36      1    0    0    1    0    0     And             A & B
6     37      1    0    0    1    0    1     Xor             A ^ B
7     38      1    0    0    1    1    0     Not             ~ A
8     39      1    0    0    1    1    1     Xnor            A ~^ B
9     41      1    0    1    0    0    1     Comparison      A>B, A<B, A=B
10    46      1    0    1    1    1    0     Buffer          A
11    47      1    0    1    1    1    1     Buffer          B
12    48      1    1    0    0    0    0     Addition        A + MEM(B)
13    49      1    1    0    0    0    1     Subtraction     A - MEM(B)
14    50      1    1    0    0    1    0     Multiplication  A * MEM(B)
15    51      1    1    0    0    1    1     Or              A | MEM(B)
16    52      1    1    0    1    0    0     And             A & MEM(B)
17    53      1    1    0    1    0    1     Xor             A ^ MEM(B)
18    54      1    1    0    1    1    0     Not             ~ MEM(A)
19    55      1    1    0    1    1    1     Xnor            A ~^ MEM(B)
20    57      1    1    1    0    0    1     Comparison      A > MEM(B), A < MEM(B), A = MEM(B)
21    62      1    1    1    1    1    0     Buffer          Move from memory
22    15      0    0    1    1    1    1     Buffer          Move to memory
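The bit fields in the opcode table can be pulled apart with a few shifts and masks. The following is a minimal Python sketch, not the project's control unit: the function name `decode_select` and the field names are hypothetical. In rows 12 to 21 of the table S[4] = 1 coincides with a MEM operand, but reading S[4] as a memory flag is an assumption, and row 22 (opcode 15) does not follow that pattern.

```python
def decode_select(select):
    """Split a 6-bit SELECT value into the fields visible in the opcode table.

    Field names are illustrative only; they are not taken from the design.
    """
    if not 0 <= select < 64:
        raise ValueError("SELECT is a 6-bit value (0..63)")
    return {
        "s5": (select >> 5) & 1,   # S[5]
        "s4": (select >> 4) & 1,   # S[4]: 1 in the MEM(...) rows 12 to 21
        "opc": select & 0xF,       # S[3:0]: selects the operation
    }


print(decode_select(32))  # Addition, A + B
print(decode_select(57))  # Comparison against MEM(B)
```

For example, opcodes 32 and 48 share the same low four bits (addition) and differ only in S[4].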
In the selection lines each bit has its own significance. The first four bits select the
particular operation to be performed on the input data. Fifth bit tells the ALU
8. DESIGN METHODOLOGY
3.1 DESIGN METHODOLOGY
There were several ways to approach creating the ALU. Our group wanted to make
use of the CAS (Complementary Addition and Subtraction) unit so that 2’s complement
arithmetic could be performed in one operation rather than in several passes through the
ALU. This led to a design in which multiple functions are executed simultaneously and
the desired output is chosen using a multiplexer network.
3.2 ALU DESIGN
The ALU designed in this project performs ten different operations on two 16-bit
inputs, with and without using memory. The advanced design utilizes the carry look-ahead
method for carry generation in order to speed up the performance of the ALU.
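The "compute everything, then select" idea can be sketched in a few lines of Python. This is only a behavioral illustration under stated assumptions: the function name `alu` is hypothetical, the operation codes are the low four bits of the opcodes in section 2.2, results are truncated to 16 bits (the source does not say how the multiplier output is sized), and comparison is left out.

```python
MASK16 = 0xFFFF  # 16-bit data path


def alu(a, b, opc):
    """Evaluate every function unit, then pick one result like a multiplexer."""
    results = {
        0: a + b,        # addition
        1: a - b,        # subtraction
        2: a * b,        # multiplication (truncated to 16 bits here)
        3: a | b,        # OR
        4: a & b,        # AND
        5: a ^ b,        # XOR
        6: ~a,           # NOT
        7: ~(a ^ b),     # XNOR
        14: a,           # buffer A
        15: b,           # buffer B
    }
    # The dictionary lookup plays the role of the multiplexer network.
    return results[opc] & MASK16


print(hex(alu(0x00FF, 0x0F0F, 5)))
```

In hardware all units run in parallel and only the multiplexer select changes; the dictionary above mimics that by building every result before indexing.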
3.3 ADDER/SUBTRACTOR UNIT
3.3.1 CARRY LOOK AHEAD GENERATOR:
The parallel adder discussed in the last paragraph is of the ripple carry type, in which the
carry output of each full-adder stage is connected to the carry input of the next higher order
stage. Therefore, the sum and carry outputs of any stage cannot be produced until the input
carry occurs; this leads to a time delay in the addition process. This delay is known as carry
propagation delay, which can be best explained by considering the following addition:
    0 1 0 1
  + 0 0 1 1
  = 1 0 0 0
Addition of the LSB position produces a carry into the second position. This carry,
when added to the bits of the second position (stage), produces a carry into the third position.
The key thing to notice in this example is that the sum bit generated in the last position
(MSB) depends on the carry that was generated by the addition in the previous positions. This
means that the adder will not produce the correct result until the LSB carry has propagated
through the intermediate full-adders. This represents a time delay that depends on the
propagation delay of each full-adder. For example, if each full-adder is considered to have a
propagation delay of 30 ns, then S3 will not reach its correct value until 90 ns after the LSB
carry is generated. Therefore, the total time required to perform the addition is 90 + 30 = 120 ns.
Obviously, this situation becomes much worse if we extend the adder circuit to add a
greater number of bits. If the adder were handling 16-bit numbers, the carry propagation
delay could be 480 ns.
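These figures follow from a simple worst-case model, assuming every full-adder stage contributes the same delay and the carry ripples through all n stages. The symbols $t_{FA}$ and $t_{ripple}$ are introduced here only for illustration:

$$t_{ripple} = n \cdot t_{FA}, \qquad 4 \times 30\ \text{ns} = 120\ \text{ns}, \qquad 16 \times 30\ \text{ns} = 480\ \text{ns}$$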
One method of speeding up this process by eliminating the inter-stage carry delay is
called look-ahead carry addition. This method utilizes logic gates to look at the lower order
bits of the augend and addend to see if a higher order carry is to be generated. It uses two
functions: carry generate and carry propagate.
Consider the circuit of the full-adder shown in fig 1. Here, we define two functions:
carry generate and carry propagate.
$P_i = A_i \oplus B_i$
$G_i = A_i B_i$
The output sum and carry can be expressed as
$S_i = P_i \oplus C_i$
$C_{i+1} = G_i + P_i C_i$
$G_i$ is called a carry generate, and it produces a carry when both $A_i$ and $B_i$ are one,
regardless of the input carry. $P_i$ is called a carry propagate because it is the term
associated with the propagation of the carry from $C_i$ to $C_{i+1}$.
Now the Boolean function for the carry output of each stage can be written as follows:
$C_2 = G_1 + P_1 C_1$
$C_3 = G_2 + P_2 C_2 = G_2 + P_2 (G_1 + P_1 C_1) = G_2 + P_2 G_1 + P_2 P_1 C_1$
$C_4 = G_3 + P_3 C_3 = G_3 + P_3 (G_2 + P_2 G_1 + P_2 P_1 C_1) = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 C_1$
From the above Boolean function it can be seen that C4 does not have to wait for C3
and C2 to propagate; in fact C4 is propagated at the same time as C2 and C3.
The Boolean functions for the output carries are expressed in sum-of-products form,
thus they can be implemented using AND-OR logic or NAND-NAND logic. Fig 2 shows the
implementation of the Boolean functions for C2, C3 and C4 using AND-OR logic.
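The carry equations can be checked exhaustively with a short Python sketch. The function name `cla_add4` is hypothetical, stages are numbered 1 to 4 as in the text (C1 is the input carry), and the carry out of stage 4, called `c5` here, is an extension of the same pattern that the text does not write out.

```python
def cla_add4(a, b, c1=0):
    """4-bit carry look-ahead addition; returns (4-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]  # index 0 is stage 1 (LSB)
    b_bits = [(b >> i) & 1 for i in range(4)]
    p1, p2, p3, p4 = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate
    g1, g2, g3, g4 = [x & y for x, y in zip(a_bits, b_bits)]  # generate

    # Every carry is a two-level AND-OR function of P, G and C1 only.
    c2 = g1 | (p1 & c1)
    c3 = g2 | (p2 & g1) | (p2 & p1 & c1)
    c4 = g3 | (p3 & g2) | (p3 & p2 & g1) | (p3 & p2 & p1 & c1)
    c5 = (g4 | (p4 & g3) | (p4 & p3 & g2) | (p4 & p3 & p2 & g1)
          | (p4 & p3 & p2 & p1 & c1))

    # Sum bits: S_i = P_i xor C_i
    s_bits = [p1 ^ c1, p2 ^ c2, p3 ^ c3, p4 ^ c4]
    total = sum(bit << i for i, bit in enumerate(s_bits))
    return total, c5


print(cla_add4(0b0101, 0b0011))  # the 5 + 3 example from the text
```

No carry expression refers to another carry, which is the point of the look-ahead scheme: none of them waits for a ripple.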
3.3.6 LOGIC DIAGRAM OF A CARRY LOOK AHEAD GENERATOR
Using a look-ahead carry generator we can easily construct a 4-bit parallel adder with a
look-ahead carry scheme. Fig 3 shows a 4-bit parallel adder with a look-ahead carry scheme.
As shown in fig 3, each sum output requires two exclusive-OR gates. The output of the first
exclusive-OR gate generates Pi, and the AND gate generates Gi. The carries are generated by
the look-ahead carry generator and applied as inputs to the second exclusive-OR gate, whose
other input is Pi; this gate generates the sum output. Each carry output is generated after a
delay of two levels of gates. Thus outputs S2 through S4 have equal propagation delay times.
[Figure: logic diagram of the carry look-ahead generator, with inputs G1, G2, G3, P1, P2, P3
and C1, and carry outputs C2, C3 and C4.]
3.4 BINARY SUBTRACTOR:
The subtraction of unsigned binary numbers can be done most conveniently by means
of complements. Remember that the subtraction A-B can be done by taking the 2’s
complement of B and adding it to A. The 2’s complement can be obtained by taking the 1’s
complement and adding one to the least significant pair of bits. The 1’s complement can be
implemented with inverters and a one can be added to the sum through the input carry.
The circuit for subtracting A-B consists of an adder with inverters placed between
each data input B and the corresponding input of the full-adder. The input carry C0 must be
equal to 1 when performing subtraction. The operation thus performed becomes A, plus the
1’s complement of B, plus 1. This is equal to A plus the 2’s complement of B. For unsigned
numbers, this gives A-B if A>=B or the 2’s complement of (B-A) if A<B. For signed
numbers, the result is A-B, provided that there is no overflow.
The addition and subtraction operations can be combined into one circuit with one
common binary adder. This is done by including an exclusive-OR gate with each full-adder.
Each exclusive-OR gate receives input M and one of the inputs of B. When M=0, we have
B ⊕ 0 = B: the full-adders receive the value of B, the input carry is 0, and the circuit
performs A plus B. When M=1, we have B ⊕ 1 = B’ and C0=1: the B inputs are all complemented
and a 1 is added through the input carry. The circuit then performs the operation A plus the
2’s complement of B. (The exclusive-OR with output V is for detecting an overflow.)
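A bit-level Python sketch of the combined adder/subtractor follows. The function name `add_sub` and the 4-bit default width are illustrative choices. The overflow output is computed as the exclusive-OR of the carry into and the carry out of the most significant stage, which is the usual textbook construction and is assumed here, since the source does not spell out how V is wired.

```python
def add_sub(a, b, m, width=4):
    """Return (result, carry_out, overflow) for A + B (m=0) or A - B (m=1)."""
    mask = (1 << width) - 1
    b_xor = b ^ (mask if m else 0)  # every B bit passes through an XOR with M
    carry = m                       # input carry C0 = M
    carry_into_msb = 0
    result = 0
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b_xor >> i) & 1
        if i == width - 1:
            carry_into_msb = carry
        result |= (ai ^ bi ^ carry) << i              # full-adder sum bit
        carry = (ai & bi) | (carry & (ai ^ bi))       # full-adder carry out
    overflow = carry_into_msb ^ carry                 # assumed wiring of V
    return result, carry, overflow


print(add_sub(5, 3, 1))  # 5 - 3
print(add_sub(3, 5, 1))  # 3 - 5, result is the 2's complement of 2
```

The same loop serves both operations; only M changes, which is why one adder is enough.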
It is worth noting that binary numbers in the signed-complement system are added
and subtracted by the same basic addition and subtraction rules as unsigned numbers.
Therefore, computers need only one common hardware circuit to handle both types of
arithmetic. The user or programmer must interpret the results of such addition or subtraction
differently, depending on whether it is assumed that the numbers are signed or unsigned.
3.5 MULTIPLICATION
Multiplication and division follow the same mathematical rules used in decimal
numbering. However, their implementation is substantially more complex as compared to
addition and subtraction. Multiplication can be performed inside a computer in the same way
that a person does so on paper. Consider 12 × 12 = 144.
      1 2
  X   1 2
      2 4   Partial product × 10^0
  + 1 2     Partial product × 10^1
    1 4 4   Final product
The multiplication process grows in steps as the number of digits in each multiplicand
increases, because the number of partial products increases. Binary numbers function the
same way, but there easily can be many partial products, because numbers require more digits
to represent them in binary versus decimal. Here is the same multiplication expressed in
binary (1100 × 1100 = 10010000):
            1 1 0 0
  X         1 1 0 0
            0 0 0 0   Partial product × 2^0
          0 0 0 0     Partial product × 2^1
        1 1 0 0       Partial product × 2^2
  +   1 1 0 0         Partial product × 2^3
    1 0 0 1 0 0 0 0   Final product
Walking through these partial products takes extra logic and time, which is why
multiplication and, by extension, division are considered advanced operations that are not
nearly as common as addition and subtraction. Methods of implementing these functions
require trade-offs between logic complexity and the time required to calculate a final result.
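The partial-product walk described above can be written as a small Python sketch. The function name `partial_products` is hypothetical; it simply lists one shifted copy of the multiplicand (or zero) per multiplier bit, so that the sum of the list is the product.

```python
def partial_products(multiplicand, multiplier):
    """Return one partial product per multiplier bit, least significant first."""
    products = []
    bit = 0
    while multiplier >> bit:
        if (multiplier >> bit) & 1:
            products.append(multiplicand << bit)  # multiplicand × 2^bit
        else:
            products.append(0)                    # this multiplier bit is 0
        bit += 1
    return products


rows = partial_products(0b1100, 0b1100)
print([format(r, "b") for r in rows], format(sum(rows), "b"))
```

Each list entry corresponds to one row of the binary worked example; a hardware multiplier trades the time of this loop against the logic needed to add the rows in parallel.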
To see how a binary multiplier can be implemented with a combinational circuit,
consider the multiplication of two 2-bit numbers as shown in figure. The multiplicand bits are
B1 and B0, the multiplier bits are A1 and A0, and the product is C3 C2 C1 C0. The first partial
product is formed by multiplying A0 by B1B0. The partial product can be implemented with
AND gates as shown in the diagram. The second partial product is formed by multiplying A1
by B1 B0 and shifting one position to the left. The two partial products are added with two
half adder (HA) circuits. Usually there are more bits in the partial products and it is necessary
to use full adders to produce the sum of the partial products. Note that the least significant bit
of the product does not have to go through an adder since it is formed by the output of the
first AND gate.
                  B1     B0
  X               A1     A0
                A0B1   A0B0
  +     A1B1    A1B0
    C3    C2      C1     C0
A combinational circuit binary multiplier with more bits can be constructed in a similar
fashion. A bit of the multiplier is ANDed with each bit of the multiplicand in as many levels
as there are bits in the multiplier. The binary output in each level of AND gates is added to
the partial product of the previous level to form a new partial product. The last level
produces the product. For J multiplier bits and K multiplicand bits we need (J x K) AND gates
and (J – 1) K-bit adders to produce a product of J + K bits.
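The 2-bit by 2-bit multiplier and the gate-count rule can both be sketched in Python. The function names are hypothetical; `multiply_2x2` wires four AND terms through two half adders as described in the text, and `array_multiplier_cost` only evaluates the (J x K) and (J - 1) counts quoted above.

```python
def half_adder(x, y):
    """Return (sum, carry) of two bits."""
    return x ^ y, x & y


def multiply_2x2(a, b):
    """Gate-level 2-bit by 2-bit multiplier; returns the 4-bit product."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    c0 = a0 & b0                                 # LSB needs no adder
    c1, carry = half_adder(a0 & b1, a1 & b0)     # first half adder
    c2, c3 = half_adder(a1 & b1, carry)          # second half adder
    return (c3 << 3) | (c2 << 2) | (c1 << 1) | c0


def array_multiplier_cost(j, k):
    """Resource counts for J multiplier bits and K multiplicand bits."""
    return {"and_gates": j * k, "adders": j - 1,
            "adder_width": k, "product_bits": j + k}


print(multiply_2x2(3, 3))
print(array_multiplier_cost(16, 16))
```

For the 16-bit operands used in this ALU the rule gives 256 AND gates and fifteen 16-bit adders for a 32-bit product.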
[Figure: 2-bit by 2-bit binary multiplier built from AND gates and two half adders (HA), with
inputs A1, A0, B1, B0 and product outputs C3, C2, C1, C0.]
3.6 COMPARATOR
The comparison of two numbers is an operation that determines if one number is
greater than, less than, or equal to the other number. A magnitude comparator is a
combinational circuit that compares two numbers, A and B, and determines their relative
magnitudes. The outcome of the comparison is specified by three binary variables that
indicate whether A>B, A=B, or A<B.
The circuit for comparing two n-bit numbers has $2^{2n}$ entries in the truth table and
becomes too cumbersome even with n=3. On the other hand, as one may suspect, a
comparator circuit possesses a certain amount of regularity. Digital functions that possess an
inherent well-defined regularity can usually be designed by means of an algorithmic
procedure. An algorithm is a procedure that specifies a finite set of steps that, if followed,
give the solution to the problem. We illustrate this method here by deriving an algorithm for
the design of a 4-bit magnitude comparator.
The algorithm is a direct application of the procedure a person uses to compare the
relative magnitudes of two numbers. Consider two numbers, A and B, with four digits each.
Write the coefficients of the numbers with descending significance.
$A = A_3 A_2 A_1 A_0$
$B = B_3 B_2 B_1 B_0$
Each subscripted letter represents one of the digits in the number. The two numbers
are equal if all pairs of significant digits are equal: A3 = B3 and A2 = B2 and A1 = B1 and
A0 = B0. When the numbers are binary, the digits are either 1 or 0, and the equality relation
of each pair of bits can be expressed logically with an exclusive-NOR function as
$x_i = A_i B_i + A_i' B_i'$ for i = 0, 1, 2, 3
where xi = 1 only if the pair of bits in position i are equal (i.e., if both are 1 or both are 0).
The equality of two numbers, A and B, is displayed in a combinational circuit by an
output binary variable that we designate by the symbol (A=B). This binary variable is equal
to 1 if the input numbers, A and B, are equal, and it is equal to 0 otherwise. For the equality
condition to exist, all xi variables must be equal to 1. This dictates an AND operation of all
variables:
$(A = B) = x_3 x_2 x_1 x_0$
The binary variable (A=B) is equal to 1 only if all pairs of digits of the two numbers
are equal.
To determine if A is greater than or less than B, we inspect the relative magnitudes of
pairs of significant digits starting from the most significant position. If the two digits are
equal, we compare the next lower significant pair of digits. This comparison continues until a
pair of unequal digits is reached. If the corresponding digit of A is 1 and that of B is 0, we
conclude that A>B. If the corresponding digit of A is 0 and that of B is 1, we have that A<B.
The sequential comparison can be expressed logically by the two Boolean functions
$(A > B) = A_3 B_3' + x_3 A_2 B_2' + x_3 x_2 A_1 B_1' + x_3 x_2 x_1 A_0 B_0'$
$(A < B) = A_3' B_3 + x_3 A_2' B_2 + x_3 x_2 A_1' B_1 + x_3 x_2 x_1 A_0' B_0$
The symbols (A>B) and (A<B) are binary output variables that are equal to 1 when A>B or
A<B, respectively.
3.6.1. 4-BIT MAGNITUDE COMPARATOR
The gate implementation of the three output variables just derived is simpler than it
seems because it involves a certain amount of repetition. The unequal outputs can use the
same gates that are needed to generate the equal output. The logic diagram of the 4-bit
magnitude comparator is shown in fig 1. The four x outputs are generated with exclusive-
NOR circuits and applied to an AND gate to give the output binary variable (A=B). The other
two outputs use the x variables to generate the Boolean functions listed previously. This is a
multilevel implementation and has a regular pattern. The procedure for obtaining magnitude
comparator circuits for binary numbers with more than four bits is obvious from this
example.
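The three comparator outputs can be evaluated directly from the Boolean functions of section 3.6. The Python sketch below uses a hypothetical function name, `compare4`, and models each gate with a bitwise operator; it is a behavioral check of the equations, not the schematic used in the project.

```python
def compare4(a, b):
    """Return the bits ((A>B), (A=B), (A<B)) for two 4-bit numbers."""
    A = [(a >> i) & 1 for i in range(4)]
    B = [(b >> i) & 1 for i in range(4)]
    nA = [1 - bit for bit in A]                      # complemented inputs
    nB = [1 - bit for bit in B]
    x = [1 - (ai ^ bi) for ai, bi in zip(A, B)]      # exclusive-NOR per bit

    eq = x[3] & x[2] & x[1] & x[0]
    gt = ((A[3] & nB[3]) | (x[3] & A[2] & nB[2])
          | (x[3] & x[2] & A[1] & nB[1])
          | (x[3] & x[2] & x[1] & A[0] & nB[0]))
    lt = ((nA[3] & B[3]) | (x[3] & nA[2] & B[2])
          | (x[3] & x[2] & nA[1] & B[1])
          | (x[3] & x[2] & x[1] & nA[0] & B[0]))
    return gt, eq, lt


print(compare4(0b1010, 0b1001))
```

The `x` terms are shared by all three outputs, which mirrors the gate reuse noted above. Widening the comparator means adding one more product term per extra bit.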
[Figure: logic diagram of the 4-bit magnitude comparator, with inputs A3 to A0 and B3 to B0,
internal signals x3 to x0, and outputs (A<B), (A>B) and (A=B).]
3.7 LOGIC GATES
Logic gates are the building blocks of digital electronics. The fundamental logic gates
include the INVERT (NOT), AND, NAND, OR, exclusive OR (XOR), and exclusive NOR
(XNOR) gates. Each of these gates performs a different logical operation. A description of
what each logic gate does, and a switch and transistor analogy for each gate, is discussed here.
3.7.1 INVERTER (NOT)
SYMBOL:
TRUTH TABLE:
INPUT   OUTPUT
A       NOT A
0       1
1       0
ELECTRONIC IMPLEMENTATION OF INVERTER:
[Figures: NMOS inverter; PMOS inverter; static CMOS inverter; schematic of a saturated-load
digital inverter.]
DESCRIPTION:
Y=~A
A NOT gate, or inverter. The output logic level is opposite to that of the input logic
level.
3.7.2 AND
SYMBOL:
TRUTH TABLE:
INPUT      OUTPUT
A    B     A AND B
0    0     0
0    1     0
1    0     0
1    1     1
ELECTRONIC IMPLEMENTATION OF AND GATE:
DESCRIPTION:
Y=A & B
The output of the AND gate is high only when both the inputs are high.
3.7.3 OR
SYMBOL:
TRUTH TABLE:
INPUT      OUTPUT
A    B     A OR B
0    0     0
0    1     1
1    0     1
1    1     1
ELECTRONIC IMPLEMENTATION OF OR GATE:
[Figure: CMOS OR gate.]
DESCRIPTION:
Y= A | B
The output of the OR gate is high when one or both the inputs are high.
3.7.4 XOR
SYMBOL:
TRUTH TABLE:
INPUT      OUTPUT
A    B     A XOR B
0    0     0
0    1     1
1    0     1
1    1     0
ELECTRONIC IMPLEMENTATION OF XOR GATE:
DESCRIPTION:
OUT=A ^ B
The output of the XOR gate goes high if the two inputs are different.
3.7.5 XNOR
SYMBOL:
TRUTH TABLE:
INPUT      OUTPUT
A    B     A XNOR B
0    0     1
0    1     0
1    0     0
1    1     1
ELECTRONIC IMPLEMENTATION OF XNOR GATE:
DESCRIPTION:
Y=A ^~ B
The output of the XNOR gate goes high if both the inputs are the same.
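The truth tables of section 3.7 can be regenerated mechanically, which is a quick way to check a gate description against its table. The Python sketch below is illustrative only; the names `GATES` and `truth_table` are hypothetical.

```python
from itertools import product

# Two-input gates from section 3.7, modeled on single bits.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}


def truth_table(name):
    """Return rows (A, B, output) in the order 00, 01, 10, 11."""
    gate = GATES[name]
    return [(a, b, gate(a, b)) for a, b in product((0, 1), repeat=2)]


for name in GATES:
    print(name, truth_table(name))
```

XOR is high exactly when the inputs differ and XNOR exactly when they match, as the generated rows show.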
24. MEMORY
4.1 MEMORY
Since the dawn of the electronic era, memory or storage devices have been an integral
part of electronic systems. As the electronic industry matured and moved away from vacuum
tubes to semiconductor devices, research in the area of semiconductor memories also
intensified. Semiconductor memory uses semiconductor-based integrated circuits to store
information. The semiconductor memory industry evolved and prospered along with the
digital computer revolution. Today, semiconductor memory arrays are widely used in many
VLSI subsystems, including microprocessors and other digital systems. In these systems, they
are used to store programs and data and in almost all cases have replaced core memory as the
active main memory. More than half of the real estate in many state-of-the-art
microprocessors is devoted to cache memories, which are essentially semiconductor memory
arrays. The unmitigated quest of system designers (both hardware and software) for more
memory capacity has accelerated the growth of the semiconductor memory industry. One of the
factors that determine a digital computer’s performance improvement is its ability to store
and retrieve massive amounts of data quickly and inexpensively. Since the beginning of the
computer age, this fact has led to the search for ideal memories. The ideal memory would be
low cost, high performance, high density, with low-power dissipation, random access,
nonvolatile, easy to test, highly reliable, and standardized throughout the industry.
Unfortunately, a single memory having all these characteristics has not yet been developed,
although each of the characteristics is held by one or another of the MOS memories. Today,
MOS memories dominate the semiconductor memory market.
4.2. MEMORY CLASSIFICATION
Semiconductor memories can be classified in many different ways. Semiconductor
memories are generally classified based on the basic operation mode, nature of the data
storage mechanism, access patterns, and the storage cell operation.
Basic operation mode: Some memory circuits allow modification of information. In
other words, we can read data from the memory and write new data into the memory,
whereas other types of memory only allow reading of prewritten information. On the basis of
this criterion, memories are classified into two major categories: read-write memories
(RWMs) and ROMs. RWMs are more popularly referred to as random access memories
(RAMs). In the early days, RAMs were referred to by that name to contrast them with
non-semiconductor memories such as magnetic tapes that allow only sequential access. It should
be noted that ROMs also allow random access the way RAMs do; however, they are not
generally called RAMs.
Storage mode: On the basis of its ability to retain the stored information with respect
to the ON/OFF state of the power supply, semiconductor memories can be classified into
two types: volatile and nonvolatile memories. Volatile memory loses all the stored
information once the power supply is turned OFF. RAM is an example of volatile memory.
Nonvolatile memory, on the other hand, retains the stored information even when the power
supply is turned OFF. ROMs and flash memories are examples of nonvolatile memories.
Nonvolatile memories can be further divided into two categories: nonvolatile ROMs (e.g.,
mask-programmed ROM) and nonvolatile read–write memories (e.g., Flash, EPROM, and
EEPROM) (Table).
Table: Memory Classification
Access patterns: On the basis of the order in which data can be accessed, memories
can be classified into two different categories: RAMs and non-RAMs. Most memories belong
to the random access class. In RAMs, information can be stored or retrieved in a random
order at a fixed rate, independent of physical location. There are two kinds of RAMs: static
random access memories (SRAMs) and dynamic random access memories (DRAMs). In
SRAMs, data is stored in a latch and it retains the data written on the cell as long as the
power supply to the memory is retained. In DRAMs, the data is stored in a capacitance as
electric charge and the written data needs to be periodically refreshed to compensate for the
charge leakage of the capacitance. It should be noted that both SRAM and DRAM are
volatile memories, i.e., they lose the written information as soon as the power supply is
turned OFF.
Examples of non-RAMs are serial access memory (SAM) and content addressable
memories (CAMs). SAM can be visualized as the opposite of RAM. SAM stores data as a
series of memory cells that can only be accessed sequentially. If the data is not in the current
location, each memory cell is checked until the needed data is found. SAM works very well
for memory buffers, where the data is normally stored in the order in which it will be used.
Texture buffer memory on a video card is an example of SAM. In RAM, we give an address
to the memory chip and we can retrieve the information stored in that particular address. But
a CAM is designed such that when a data word (an assemblage of bits usually the width of
the address bus) is supplied to the chip, the CAM searches its entire memory to see if that
data word is stored anywhere in the chip. If the data word is found, the CAM returns a list of
one or more storage addresses where the word was found and in some architectures, it also
returns the data word.
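The difference between the two access styles can be shown with a toy Python model. Both function names are hypothetical, and the CAM is modeled as a sequential scan purely for illustration; a real CAM compares all locations in parallel in hardware.

```python
def ram_read(memory, address):
    """RAM: an address goes in, the stored word comes out."""
    return memory[address]


def cam_search(memory, word):
    """CAM: a data word goes in, the matching addresses come out."""
    return [address for address, stored in enumerate(memory) if stored == word]


memory = [0x00FF, 0x1234, 0x00FF, 0xABCD]
print(hex(ram_read(memory, 1)))    # word stored at address 1
print(cam_search(memory, 0x00FF))  # every address holding 0x00FF
print(cam_search(memory, 0x5555))  # empty list: word not present
```

The return value of `cam_search` corresponds to the "list of one or more storage addresses" mentioned in the text.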
Finally, there needs to be a way to denote how much data can be stored by any
particular memory device. This, fortunately for us, is very simple and straightforward: just
count up the number of bits (or bytes, 1 byte = 8 bits) of total data storage space. Due to the
high capacity of modern data storage devices, metric prefixes are generally affixed to the unit
of bytes in order to represent storage space: 1.6 Gigabytes is equal to 1.6 billion bytes, or
12.8 billion bits, of data storage capacity. The only caveat here is to be aware of rounded
numbers. Because the storage mechanisms of many random-access memory devices are
typically arranged so that the number of "cells" in which bits of data can be stored appears in
binary progression (powers of 2), a "one kilobyte" memory device most likely contains 1024
(2 to the power of 10) locations for data bytes rather than exactly 1000. A "64 kbyte" memory
device actually holds 65,536 bytes of data (2 to the 16th power), and should probably be
called a "66 Kbyte" device to be more precise. When we round numbers in our base-10
system, we fall out of step with the round equivalents in the base-2 system.
One simple memory circuit is called the data latch, or D-latch. This is a device which,
when “told” to do so via the clock input, notes the state of its input and holds that state at its
output. The output state remains unchanged even if the input state changes, until another
update request is received. Traditionally, the input of the D-latch is designated by D and the
latched output by Q. The update command is provided by asserting the clock input, either in
the form of a transition (from HI to LO or from LO to HI) in so-called edge-triggered devices,
or as a level in level-triggered devices, where the output follows the input whenever the
clock is HI.
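The two update styles can be contrasted with a small behavioral model in Python. The class names `DLatch` and `DFlipFlop` are illustrative, and the edge-triggered model is assumed to act on the LO to HI transition; a HI to LO device would only change the edge test.

```python
class DLatch:
    """Level-triggered: Q follows D whenever the clock is HI."""

    def __init__(self):
        self.q = 0

    def update(self, d, clk):
        if clk:
            self.q = d
        return self.q


class DFlipFlop:
    """Edge-triggered: Q takes the value of D only on a LO to HI clock edge."""

    def __init__(self):
        self.q = 0
        self._previous_clk = 0

    def update(self, d, clk):
        if clk and not self._previous_clk:  # rising edge detected
            self.q = d
        self._previous_clk = clk
        return self.q


latch, flop = DLatch(), DFlipFlop()
for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0)]:
    print(d, clk, latch.update(d, clk), flop.update(d, clk))
```

With the clock held HI and D changing from 1 to 0, the latch output changes but the edge-triggered output holds its value until the next rising edge.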
D-Latch Symbol and Truth Tables
Data present on the input D is passed to the outputs Q and Q’ (the complement of Q) when the
clock is asserted. The truth table for an edge-triggered D-latch is shown to the right of the
schematic symbol. Some D-latches also have Preset and Clear inputs that allow the output to
be set HI or LO independent of the clock signal. In normal operation, these two inputs are
pulled high so as not to interfere with the clocked logic. However, the outputs Q and Q’ can
be initialized to a known state, using the Preset and Clear inputs, when the clocked logic is
not active.
29. VERILOG
5.1 VERILOG
In the semiconductor and electronic design industry, Verilog is a hardware description
language (HDL) used to model electronic systems. Verilog HDL, not to be confused with
VHDL, is most commonly used in the design, verification, and implementation of digital
logic chips at the register-transfer level (RTL) of abstraction. It is also used in the
verification of analog and mixed-signal circuits.
5.2 HISTORY OF VERILOG
Beginning
Verilog was invented by Phil Moorby and Prabhu Goel during the winter of 1983/1984 at
Automated Integrated Design Systems (later renamed to Gateway Design Automation in
1985) as a hardware modeling language. Gateway Design Automation was later purchased by
Cadence Design Systems in 1990. Cadence now has full proprietary rights to Gateway's
Verilog and the Verilog-XL logic simulator.
Verilog-95
With the increasing success of VHDL at the time, Cadence decided to make the language
available for open standardization. Cadence transferred Verilog into the public domain under
the Open Verilog International (OVI) (now known as Accellera) organization. Verilog was
later submitted to IEEE and became IEEE Standard 1364-1995, commonly referred to as
Verilog-95.
In the same time frame Cadence initiated the creation of Verilog-A to put standards support
behind its analog simulator Spectre. Verilog-A was never intended to be a standalone
language and is a subset of Verilog-AMS which encompassed Verilog-95.
Verilog 2001
Extensions to Verilog-95 were submitted back to IEEE to cover the deficiencies that users
had found in the original Verilog standard. These extensions became IEEE Standard 1364-
2001 known as Verilog-2001.
Verilog-2001 is a significant upgrade from Verilog-95. First, it adds explicit support for (2's
complement) signed nets and variables. Previously, code authors had to perform signed-
operations using awkward bit-level manipulations (for example, the carry-out bit of a simple
8-bit addition required an explicit description of the boolean-algebra to determine its correct
value.) The same function under Verilog-2001 can be more succinctly described by one of
the built-in operators: +, -, /, *, >>>. A generate/endgenerate construct (similar to VHDL's
generate/endgenerate) allows Verilog-2001 to control instance and statement instantiation
through normal decision-operators (case/if/else). Using generate/endgenerate, Verilog-2001
can instantiate an array of instances, with control over the connectivity of the individual
instances. File I/O has been improved by several new system-tasks. And finally, a few syntax
additions were introduced to improve code readability (e.g., always @*, named-parameter
override, C-style function/task/module header declaration).
Verilog-2001 is the dominant flavor of Verilog supported by the majority of commercial
EDA software packages.
Verilog 2005
Not to be confused with SystemVerilog, Verilog 2005 (IEEE Standard 1364-2005) consists of
minor corrections, spec clarifications, and a few new language features (such as the uwire
keyword.)
A separate part of the Verilog standard, Verilog-AMS, attempts to integrate analog and
mixed-signal modeling with traditional Verilog.
SYSTEM VERILOG
SystemVerilog is a superset of Verilog-2005, with many new features and capabilities to aid
design-verification and design-modeling.
The advent of high-level verification languages such as OpenVera and Verisity's e
language encouraged the development of Superlog by Co-Design Automation Inc. Co-Design
Automation Inc was later purchased by Synopsys. The foundations of Superlog and Vera
were donated to Accellera, which later became the IEEE standard P1800-2005:
SystemVerilog.
5.3 ABOUT LANGUAGE
Hardware description languages, such as Verilog, differ from software programming
languages in several fundamental ways. HDLs add the concept of concurrency, which is
parallel execution of multiple statements in explicitly specified threads, propagation of time,
and signal dependency (sensitivity). There are two assignment operators, a blocking
assignment (=), and a non-blocking (<=) assignment. The non-blocking assignment allows
designers to describe a state-machine update without needing to declare and use temporary
storage variables. Since these concepts are part of Verilog's language semantics, designers
could quickly write descriptions of large circuits, in a relatively compact and concise form.
At the time of Verilog's introduction (1984), Verilog represented a tremendous productivity
improvement for circuit designers who were already using graphical schematic-capture, and
specially-written software programs to document and simulate electronic circuits.
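The difference between the blocking (=) and non-blocking (<=) assignments mentioned above can be modeled in Python. This is a simplified sketch of the scheduling idea only, not a Verilog simulator: the function names are hypothetical, each assignment is a plain copy from one named signal to another, and all assignments are assumed to sit in the same clocked block.

```python
def run_blocking(state, assignments):
    """Blocking (=): each assignment takes effect before the next one runs."""
    state = dict(state)
    for target, source in assignments:
        state[target] = state[source]
    return state


def run_nonblocking(state, assignments):
    """Non-blocking (<=): sample every right-hand side first, then update."""
    sampled = [(target, state[source]) for target, source in assignments]
    new_state = dict(state)
    for target, value in sampled:
        new_state[target] = value
    return new_state


start = {"a": 1, "b": 2}
swap = [("a", "b"), ("b", "a")]      # a gets b, b gets a
print(run_blocking(start, swap))     # b is overwritten with the new a
print(run_nonblocking(start, swap))  # values are exchanged
```

The non-blocking version swaps the two registers without a temporary variable, which is the state-machine convenience the text refers to.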
The designers of Verilog wanted a language with syntax similar to the C programming
language, which was already widely used in engineering software development. Verilog is
case-sensitive, has a basic preprocessor (though less sophisticated than ANSI C/C++), and
equivalent control flow keywords (if/else, for, while, case, etc.), and compatible language
operators precedence. Syntactic differences include variable declaration (Verilog requires bit-
widths on net/reg types), demarcation of procedural-blocks (begin/end instead of curly braces
{}), though there are many other minor differences.
A Verilog design consists of a hierarchy of modules. Modules encapsulate design hierarchy,
and communicate with other modules through a set of declared input, output, and
bidirectional ports. Internally, a module can contain any combination of the following:
net/variable declarations (wire, reg, integer, etc.), concurrent and sequential statement blocks
and instances of other modules (sub-hierarchies). Sequential statements are placed inside a
begin/end block and executed in sequential order within the block. But the blocks themselves
are executed concurrently, qualifying Verilog as a Dataflow language.
Verilog's concept of 'wire' consists of both signal values (4-state: "1, 0, floating, undefined"),
and strengths (strong, weak, etc.). This system allows abstract modeling of shared signal-lines,
where multiple sources drive a common net. When a wire has multiple drivers, the wire's
(readable) value is resolved by a function of the source drivers and their strengths.
A subset of statements in the Verilog language is synthesizable. Verilog modules that
conform to a synthesizable coding style, known as RTL (register transfer level), can be
physically realized by synthesis software. Synthesis software algorithmically transforms the
(abstract) Verilog source into a netlist, a logically equivalent description consisting only of
elementary logic primitives (AND, OR, NOT, flip-flops, etc.) that are available in a specific
VLSI technology. Further manipulations to the netlist ultimately lead to a circuit fabrication
blueprint (such as a photo mask-set for an ASIC, or a bitstream file for an FPGA).
There are now two industry standard hardware description languages, VHDL and Verilog.
The complexity of ASIC and FPGA designs has meant an increase in the number of specialist
design consultants with specific tools and with their own libraries of macro and mega cells
written in either VHDL or Verilog. As a result, it is important that designers know both
VHDL and Verilog and that EDA tools vendors provide tools that provide an environment
allowing both languages to be used in unison. For example, a designer might have a model of
a PCI bus interface written in VHDL, but wants to use it in a design with macros written in
Verilog.
VHDL (Very high speed integrated circuit Hardware Description Language) became IEEE
standard 1076 in 1987. It was updated in 1993 and is known today as "IEEE standard
1076-1993". The Verilog hardware description language has been used far longer than VHDL and
has been used extensively since it was launched by Gateway in 1983. Cadence bought
Gateway in 1989 and opened Verilog to the public domain in 1990. It became IEEE standard
1364 in December 1995.
There are two aspects to modeling hardware that any hardware description language
facilitates: true abstract behavior and hardware structure. This means modeled hardware
behavior is not prejudiced by structural or design aspects of hardware intent and that
hardware structure is capable of being modeled irrespective of the design's behavior.
5.4 VHDL/VERILOG COMPARED & CONTRASTED
This section compares and contrasts individual aspects of the two languages; they are listed in
alphabetical order.
Capability
Hardware structure can be modeled equally effectively in both VHDL and Verilog. When
modeling abstract hardware, the capability of VHDL can sometimes only be achieved in
Verilog when using the PLI. The choice of which to use is therefore not based solely on
technical capability but also on:
• personal preferences
• EDA tool availability
• commercial, business and marketing issues
The modeling constructs of VHDL and Verilog cover a slightly different spectrum across the
levels of behavioral abstraction; see Figure 1.
Figure 1: HDL modeling capability
COMPILATION
VHDL. Multiple design units (entity/architecture pairs) that reside in the same system file
may be separately compiled if so desired. However, it is good design practice to keep each
design unit in its own system file, in which case separate compilation should not be an issue.
Verilog. The Verilog language is still rooted in its native interpretive mode. Compilation is a means
of speeding up simulation, but has not changed the original nature of the language. As a result,
care must be taken with both the compilation order of code written in a single file and the
compilation order of multiple files. Simulation results can change simply by changing the
order of compilation.
DATA TYPES
Verilog. Compared to VHDL, Verilog data types are very simple, easy to use and very
much geared towards modeling hardware structure as opposed to abstract hardware modeling.
Unlike VHDL, all data types used in a Verilog model are defined by the Verilog language
and not by the user. There are net data types, for example wire, and a register data type called
reg. A model with a signal whose type is one of the net data types has a corresponding
electrical wire in the implied modeled circuit. Objects (that is, signals) of type reg hold their
value over simulation delta cycles and should not be confused with the modeling of a
hardware register. Verilog may be preferred because of its simplicity.
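The distinction between net and reg types can be illustrated with a minimal sketch. The module and signal names below are hypothetical and are not taken from the project; the code assumes Verilog-2001 port declarations. It shows one net, one reg that synthesizes to purely combinational logic, and one reg that does correspond to a flip-flop.

```verilog
// Sketch only: reg_vs_wire and all signal names are illustrative assumptions.
module reg_vs_wire (
  input  wire a,
  input  wire b,
  input  wire clk,
  output wire y_net,   // net: driven by a continuous assignment
  output reg  y_comb,  // reg, but no storage element is implied
  output reg  y_ff     // reg that does infer a flip-flop
);
  // A net has no storage; it simply reflects the value of its driver.
  assign y_net = a & b;

  // A reg assigned in a combinational always block only holds its value
  // between simulation events; synthesis produces a plain AND gate.
  always @(a or b)
    y_comb = a & b;

  // Only a clocked always block makes the reg correspond to a register.
  always @(posedge clk)
    y_ff <= a & b;
endmodule
```

Whether a reg becomes a hardware register therefore depends on how it is assigned, not on its declared type.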
Design reusability
Verilog. There is no concept of packages in Verilog. Functions and procedures (tasks, in Verilog terms) used within a
model must be defined in the module. To make functions and tasks generally accessible
from different module statements, they must be placed in a separate
system file and included using the `include compiler directive.
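As a rough illustration of the point above, the sketch below declares a function inside the module that uses it. The names compare_unit, max2 and alu_funcs.vh are hypothetical; the comment shows where an `include directive could replace the in-module definition if the function were moved to a shared file.

```verilog
// Sketch only: compare_unit, max2 and alu_funcs.vh are hypothetical names.
module compare_unit (
  input  wire [15:0] a,
  input  wire [15:0] b,
  output wire [15:0] larger
);
  // A function must be declared inside the module that calls it.
  // To share it between modules, the function ... endfunction text could be
  // moved to a file such as alu_funcs.vh and replaced here by the line:
  //   `include "alu_funcs.vh"
  function [15:0] max2;
    input [15:0] x;
    input [15:0] y;
    begin
      max2 = (x > y) ? x : y;   // unsigned comparison of the two operands
    end
  endfunction

  assign larger = max2(a, b);
endmodule
```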
Easiest to Learn
Starting with zero knowledge of either language, Verilog is probably the easier to grasp and
understand. This assumes the Verilog compiler directive language for simulation and the PLI
language are not included. If these languages are included, they can be looked upon as two
additional languages that need to be learned. VHDL may seem less intuitive at first for two
primary reasons. First, it is very strongly typed; a feature that makes it robust and powerful
for the advanced user after a longer learning phase. Second, there are many ways to model
the same circuit, especially those with large hierarchical structures.
Forward and back annotation
A spin-off from Verilog is the Standard Delay Format (SDF). This is a general purpose
format used to define the timing delays in a circuit. The format provides a bidirectional link
between chip layout tools and either synthesis or simulation tools, in order to provide more
accurate timing representations. The SDF format is now an industry standard in its own right.
High level constructs
Verilog. Except for being able to parameterize models by overloading
parameter constants, there is no equivalent to the high-level VHDL modeling statements in
Verilog.
LANGUAGE EXTENSIONS
The use of language extensions will make a model non standard and most likely not portable
across other design tools. However, sometimes they are necessary in order to achieve the
desired results.
Verilog. The Programming Language Interface (PLI) is an interface mechanism between
Verilog models and Verilog software tools. For example, a designer or, more likely, a Verilog
tool vendor can specify user-defined tasks or functions in the C programming language, and
then call them from the Verilog source description. Use of such tasks or functions makes a
Verilog model nonstandard, so it may not be usable by other Verilog tools. Their use is not
recommended.
Libraries
Verilog. There is no concept of a library in Verilog. This is due to its origins as an
interpretive language.
Low Level Constructs
Verilog. The Verilog language was originally developed with gate level modeling in mind,
and so has very good constructs for modeling at this level and for modeling the cell
primitives of ASIC and FPGA libraries. Examples include User Defined Primitives (UDPs),
truth tables and the specify block for specifying timing delays across a module.
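The low-level constructs named above can be sketched as follows. This is an illustrative example only: the primitive mux2_udp, the cell mux2_cell and the delay values are assumptions, not cells from any real ASIC or FPGA library. The UDP describes a 2-to-1 multiplexer with a truth table, and the specify block attaches pin-to-pin delays to the module that wraps it.

```verilog
// Sketch only: mux2_udp, mux2_cell and the delay values are hypothetical.

// User Defined Primitive: behavior given as a truth table.
primitive mux2_udp (y, a, b, sel);
  output y;
  input  a, b, sel;
  table
  // a  b  sel : y
     0  ?  0   : 0;   // sel = 0 selects a
     1  ?  0   : 1;
     ?  0  1   : 0;   // sel = 1 selects b
     ?  1  1   : 1;
     0  0  x   : 0;   // sel unknown, but both inputs agree
     1  1  x   : 1;
  endtable
endprimitive

// Library-style cell: the UDP plus timing across the module.
module mux2_cell (
  output y,
  input  a,
  input  b,
  input  sel
);
  mux2_udp u1 (y, a, b, sel);

  // Pin-to-pin delays in simulation time units (illustrative values).
  specify
    (a   => y) = 2;
    (b   => y) = 2;
    (sel => y) = 3;
  endspecify
endmodule
```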
Managing large designs
Verilog. There are no statements in Verilog that help manage large designs.
Operators
The majority of operators are the same between the two languages. Verilog does have very
useful unary reduction operators that are not in VHDL. A loop statement can be used in
VHDL to perform the same operation as a Verilog unary reduction operator. VHDL has the
mod operator, which has no direct equivalent in Verilog; Verilog's % operator behaves like the VHDL rem operator.
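A short sketch of the unary reduction operators mentioned above follows. The module names reduction_demo and reduction_tb are hypothetical. Each operator collapses all bits of a vector into a single bit, which in VHDL would need a loop.

```verilog
// Sketch only: reduction_demo and reduction_tb are illustrative names.
module reduction_demo (
  input  wire [7:0] data,
  output wire       all_ones,  // AND of every bit of data
  output wire       any_one,   // OR of every bit of data
  output wire       parity     // XOR of every bit (1 for an odd number of ones)
);
  assign all_ones = &data;
  assign any_one  = |data;
  assign parity   = ^data;
endmodule

// Minimal test bench that prints the three results for one input value.
module reduction_tb;
  reg  [7:0] data;
  wire       all_ones, any_one, parity;

  reduction_demo dut (.data(data), .all_ones(all_ones),
                      .any_one(any_one), .parity(parity));

  initial begin
    data = 8'b1011_0000;   // three bits set
    #1 $display("data=%b all_ones=%b any_one=%b parity=%b",
                data, all_ones, any_one, parity);
    // prints: data=10110000 all_ones=0 any_one=1 parity=1
    $finish;
  end
endmodule
```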
Parameterizable models
Verilog. A specific width model can be instantiated from a generic n-bit model using
overloaded parameter values. The generic model must have a default parameter value
defined. This means two things. In the absence of an overloaded value being specified, it will
still synthesize, but will use the specified default parameter value. Also, it does not need to be
instantiated with an overloaded parameter value specified, before it will synthesize.
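The following is a minimal sketch of a parameterizable model, assuming a hypothetical n-bit adder called adder_n with a default width of 8. It is instantiated once with the default and once with the parameter overloaded to 16, matching the 16-bit data path discussed in this report; it is not the adder actually used in the project.

```verilog
// Sketch only: adder_n, adder_top and all port names are hypothetical.
module adder_n (a, b, sum);
  parameter WIDTH = 8;               // default used when no override is given
  input  [WIDTH-1:0] a, b;
  output [WIDTH:0]   sum;            // one extra bit holds the carry out
  assign sum = a + b;
endmodule

module adder_top (
  input  wire [7:0]  a8,  b8,
  input  wire [15:0] a16, b16,
  output wire [8:0]  sum8,
  output wire [16:0] sum16
);
  // No overloaded value: synthesizes with the default WIDTH = 8.
  adder_n       u_default (.a(a8),  .b(b8),  .sum(sum8));

  // Overloaded parameter value: the same generic model becomes 16 bits wide.
  adder_n #(16) u_wide    (.a(a16), .b(b16), .sum(sum16));
endmodule
```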
Procedures and tasks
VHDL allows concurrent procedure calls; Verilog does not allow concurrent task calls.
Readability
This is more a matter of coding style and experience than language feature. VHDL is a
verbose language; its roots are based on Ada. Verilog is more like C because its
constructs are based approximately 50% on C and 50% on Ada. For this reason an existing C
programmer may prefer Verilog over VHDL, although an existing programmer of both C
and Ada may find the mix of constructs somewhat confusing at first. Whatever HDL is used,
when writing or reading an HDL model to be synthesized it is important to think about
hardware intent.
Structural replication
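Verilog. The original Verilog standard (IEEE 1364-1995) has no equivalent to the VHDL generate statement; generate constructs were added later in Verilog-2001 (IEEE 1364-2001).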
Test harnesses
Designers typically spend about 50% of their time writing synthesizable models and the other
50% writing a test harness to verify the synthesizable models. Test harnesses are not
restricted to the synthesizable subset and so are free to use the full potential of the language.
VHDL has generic and configuration statements that are useful in test harnesses, that are not
found in Verilog.
Verboseness
Verilog. Signals representing objects of different bit widths may be assigned to each other.
The signal representing the smaller number of bits is automatically padded out to that of the
larger number of bits, and is independent of whether it is the assigned signal or not. Unused
bits will be automatically optimized away during the synthesis process. This has the
advantage of not needing to model quite so explicitly as in VHDL, but does mean unintended
modeling errors will not be identified by an analyzer.
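The implicit width handling described above can be seen in a small simulation-only sketch. The module and signal names are hypothetical. A 4-bit value is assigned to an 8-bit net and zero-padded, and an 8-bit result is assigned to a 4-bit net, where the upper bits are dropped without any error being reported.

```verilog
// Sketch only: width_mismatch_demo and its signals are illustrative names.
module width_mismatch_demo;
  reg  [3:0] narrow;
  wire [7:0] wide;
  wire [3:0] truncated;

  assign wide      = narrow;        // unsigned 4-bit value is zero-padded to 8 bits
  assign truncated = wide + 8'h10;  // upper four bits of the sum are silently dropped

  initial begin
    narrow = 4'b1010;
    #1 $display("narrow=%b wide=%b truncated=%b", narrow, wide, truncated);
    // prints: narrow=1010 wide=00001010 truncated=1010
    $finish;
  end
endmodule
```

The added 8'h10 disappears from the result entirely, which is the kind of unintended modeling error an analyzer will not flag.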
37. CADENCE
6.1 CADENCE TOOLS
The Cadence suite is a huge collection of programs for different CAD applications
from VLSI design to high-level DSP programming. The suite is divided into different
“packages,” and for VLSI design, the packages we will be using are the IC package and the
DSMSE package.
The Cadence toolset is a complete microchip EDA system, which is intended to
develop professional, full-scale, mixed-signal microchips and breadboards. The modules
included in the toolset are for schematic entry, design simulation, data analysis, physical
layout, and final verification. The strength of the Cadence tools is in their analog
design/simulation/layout and mixed-signal verification, and they are often used in tandem with other
tools for RF and/or digital design/simulation/layout, where complete top-level verification is
done in the Cadence tools. Another important concept is that the Cadence tools only provide
a framework for doing design. Without a foundry-provided design kit, no design can be
done.
Cadence Design Systems, Inc. (NASDAQ: CDNS), the leader in global electronic-design
innovation, announced that Global Unichip Corporation (GUC), a leading system-on-chip (SoC)
design foundry, was the first Taiwan-based design company to complete a successful tape-out
of a 65-nanometer device. The success of this 65-nanometer tape-out further strengthened
GUC's advanced technology capabilities to serve top-tier customers worldwide. GUC
used the Cadence(R) Low-Power Solution and SoC Encounter(TM) GXL RTL-to-GDSII
system to achieve the tape-out.
6.2 ABOUT CADENCE COMPANY
Cadence enables global electronic-design innovation and plays an essential role in the
creation of today's integrated circuits and electronics. Customers use Cadence software and
hardware, methodologies, and services to design and verify advanced semiconductors,
consumer electronics, networking and telecommunications equipment, and computer
systems. Cadence reported 2006 revenues of approximately $1.5 billion, and has
approximately 5,200 employees. The company is headquartered in San Jose, Calif., with sales
offices, design centers, and research facilities around the world to serve the global electronics
industry.
Since ours is a digital design, the tools used in our project are:
• IUS: Incisive Unified Simulator
• RC: RTL Compiler
• SoC Encounter: System-on-Chip Encounter
These tools work on 18-nanometer technology.
Now, we will study in detail about these tools and results of our project using these tools.
FLOW OF DESIGN USING CADENCE TOOLS
6.3 INCISIVE UNIFIED SIMULATOR
Incisive Unified Simulator (IUS) is a tool used to simulate digital circuits. The designs are
represented using languages such as Verilog or VHDL. IUS supports those languages as well as
additional languages used for specialized verification functions, such as SystemC, which is based on C++. The tool
handles any design that can be given a digital representation in the key languages. The Verilog-only
environment is called NC-Verilog and the VHDL one is called NC-VHDL. Depending on the
complexity of their simulation tasks, designers will create environments that use multiple languages to perform advanced
verification tasks.
Who needs IUS:
• System architects who need to analyze various scenarios to determine what the right grouping of components would be. This is typically done with simple IP models to look at high-level behavior.
• Design engineers who are creating the various parts of the circuit use IUS to test the behavior and make sure the requirements are met.
• Verification engineers, a specialized team that takes the design once it is completed and creates tests that exercise the complete design under conditions as close to actual as possible.
• IP vendors use IUS to create IP models and ensure that their models behave correctly with the tools that their customers will use.
• Board designers will use IUS as means
BENEFITS
• Speeds time-to-market with lower risk and higher predictability.
• Increases productivity by enabling verification to start months earlier, before test bench development and
simulation.
• Improves quality and reduces risk of re-spins by exposing corner-case functional bugs that are difficult or
impossible to find using conventional methods.
• Reduces block design effort and debug time, and shortens integration time.
• Provides design teams with an advanced debug environment with simulation synergies for ease of adoption.
• Offers the ultimate simulation-based speed and efficiency.
• Increases RTL performance by 100 times with native transaction-level simulation and optional Acceleration-
on-Demand.
• Reduces test bench development up to 50% with transaction-level support, unified test generation, and
verification component re-use.
• Shortens verification time, finds bugs faster, and eliminates exhaustive simulation runs with dynamic assertion
checking.
• Decreases debug time up to 25% through unified transaction/signal viewing, HDL analysis capability, and
unified debug environment for all languages.
This program is a front-end to some of the other tools in this directory. Its job is to compile, elaborate,
and launch the simulation.
ncvlog
This is the Verilog compiler. Typing this command in with no arguments gives you a listing of the possible
options. Two arguments which are useful are -cdslib, which specifies the location of your cds.lib file, and
-work, which specifies the location of your worklib file.
To compile a Verilog file named test.v, and its test bench named tb_test.v, using a cdslib of cds.lib, and a work
library of worklib, you can execute the following command:
ncvlog -cdslib cds.lib -work worklib test.v tb_test.v
ncelab
This is the elaborator. Again, typing this command with no arguments outputs a list of options. The above
-cdslib and -work arguments apply.
To elaborate a compiled test bench called tb_test.v, execute the following command:
ncelab -cdslib cds.lib -work worklib worklib.tb_test_v
ncsim
This is the actual simulator.
To launch a compiled and elaborated test bench, execute the following command:
ncsim -gui -cdslib cds.lib -work worklib worklib.tb_test_v:module
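For readers who want something concrete to run through the three commands, the sketch below shows what a test.v and tb_test.v pair might contain. Both files are hypothetical and are not the project's sources. The test bench module is named tb_test_v only so that it matches the cell name used in the ncelab and ncsim commands above; that naming is an assumption.

```verilog
// ---- test.v (hypothetical design under test) ----
module test (
  input  wire [15:0] a,
  input  wire [15:0] b,
  output wire [16:0] sum      // 17 bits so the carry out is kept
);
  assign sum = a + b;
endmodule

// ---- tb_test.v (hypothetical test bench) ----
// Named tb_test_v to match the cell name in the commands above (assumption).
module tb_test_v;
  reg  [15:0] a, b;
  wire [16:0] sum;

  test dut (.a(a), .b(b), .sum(sum));

  initial begin
    a = 16'd1000;
    b = 16'd2345;
    #10;
    // Self-check: compare against the expected value and report.
    if (sum !== 17'd3345) $display("FAIL: sum=%0d", sum);
    else                  $display("PASS: sum=%0d", sum);
    $finish;
  end
endmodule
```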