This document discusses key trade-offs in chip design including time, area, power, reliability, and configurability. It covers topics like cycle time, die area and cost, ideal and practical scaling, power consumption, and how these factors relate to processor design trade-offs between area, time and power. Key considerations in design include optimizing the pipeline for cycle time, minimizing die area and maximizing yield, accounting for the increasing dominance of wire delays over gate delays with scaling, and balancing dynamic and static power sources.
SOC Chip Basics
1. CHIP BASICS
TIME, AREA, POWER, RELIABILITY &
CONFIGURABILITY
Mr. A. B. Shinde
Assistant Professor,
Electronics Engineering,
PVPIT, Budhgaon, Sangli
shindesir.pvp@gmail.com
2. Contents…
• Introduction,
• Cycle Time,
• Die Area and Cost,
• Ideal and Practical Scaling,
• Power,
• Area–Time–Power Trade-Offs in
Processor Design,
• Reliability,
• Configurability
3. Introduction
• The trade-off (a balance between two desirable but incompatible features)
between cost and performance is fundamental to any system design.
• The Semiconductor Industry Association (SIA) regularly makes
projections, called the SIA road map, of technology advances.
• Advances in lithography make transistors smaller.
• The minimum width of the transistor gates is defined by the process
technology.
The table refers to process technology generations in nanometers (nm); older
generations are referred to in microns (μm).
4. Design Trade-Offs
• In making basic design trade-offs, we have five different considerations.
1. Time: includes partitioning instructions into events or cycles and the
basic pipelining mechanisms used to speed up instruction execution.
2. Area: the cost or area occupied by a particular feature is another
important aspect of the architectural trade-off.
3. Power consumption: affects both performance and implementation.
Instruction sets that require more implementation area are less valuable
than instruction sets that use less area.
4. Reliability: comes into play to cope with deep-submicron effects.
5. Configurability: provides an additional opportunity for designers to
trade off recurring and nonrecurring design costs.
5. Design Trade-Offs
• In terms of complexity, various trade-offs are possible.
• For instance, area can be traded off for performance.
• Very large scale integration (VLSI) complexity theory has shown that a
bound exists for processor designs.
• It is also possible to trade off time T for power P.
• The figure shows the possible trade-offs involving area, time, and power in a
processor design.
Figure: Processor design trade-offs
6. Requirements and Specifications
• The five basic SOC trade-offs provide a framework for analyzing
SOC requirements so that these can be translated into specifications.
• Cost requirements coupled with market size can be translated into die
cost and process technology.
• Requirements for wearability and weight place bounds on power or
energy consumption.
• Limitations on clock frequency can affect heat dissipation.
• For a particular design, any one of the trade-off criteria may have the
highest priority.
7. Requirements and Specifications
• Consider some examples:
• High-performance systems will optimize time at the expense of cost
and power.
• Low-cost systems will optimize die cost, reconfigurability, and design
reuse.
• Wearable systems (e.g., cell phones) stress low power, since the power
supply determines the system weight.
• Embedded systems in planes and other safety-critical applications would
stress reliability, with performance and design lifetime being important
secondary considerations.
• Gaming systems would stress cost (especially production cost) and,
secondarily, performance.
8. Cycle Time
• Time receives considerable attention from processor designers.
• It is the basic measure of performance;
however, breaking actions into cycles and reducing both cycle count and
cycle time are important but inexact tasks.
• The way in which actions are partitioned into cycles is important.
• A common problem is having unanticipated “extra” cycles required
by a basic action such as a cache miss.
9. Cycle Time
• Defining a Cycle:
• A cycle (of the clock) is the basic time unit for processing information.
• In a synchronous system, the clock rate is a fixed value and the
cycle time is determined by finding the maximum time to accomplish
a frequent operation in the machine, such as an add or a register data
transfer.
• Cycle time must be sufficient for data to be stored into a specified
destination register.
Figure: Possible sequence of actions within a cycle
10. Cycle Time
• A cycle begins when the instruction decoder specifies the values
for the registers in the system.
• These control values connect the output of a specified register to
another register or an adder or similar object.
• This allows data from source registers to propagate through
designated combinatorial logic into the destination register.
• Finally, after a suitable setup time, all registers are sampled by an
edge or pulse produced by the clocking system.
11. Cycle Time
• In a synchronous system:
• The cycle time is determined by the sum of the worst-case time for
each step or action within the cycle.
• However, the clock itself may not arrive at the anticipated time (due
to propagation or loading effects).
• We call the maximum deviation from the expected time of clock arrival
the (uncontrolled) clock skew.
12. Cycle Time
• In an asynchronous system:
• The cycle time is simply determined by the completion of an event
or operation.
• A completion signal is generated, which then allows the next
operation to begin.
• Asynchronous design is generally not used within pipelined
processors because of the pipeline timing constraints.
13. Cycle Time
• Optimum Pipeline:
• At one time, the concept of pipelining in a processor was treated as
an advanced processor design technique.
• For several decades, pipelining has been an integral part of any
processor or controller design.
• The trade-off between cycle time and number of pipeline stages is
treated in the section on optimum pipeline.
14. Cycle Time
• Optimum Pipeline:
• A basic optimization for the pipeline processor designer is the
partitioning of the pipeline into concurrently operating segments.
• A large number of segments allow a maximum speedup.
However, each new segment carries clocking overhead with it, which
can adversely affect performance.
• If we ignore the problem of fitting actions into an integer number of
cycles, we can derive an optimal cycle time, Δt, and
hence the level of segmentation for a simple pipelined processor.
15. Cycle Time
Figure: Optimal pipelining.
• (a) Unclocked instruction execution time, T.
• (b) T is partitioned into S segments. Each segment requires C clocking
overhead.
• (c) Clocking overhead and its effect on cycle time, T/S.
• (d) Effect of a pipeline disruption (or a stall in the pipeline).
16. Cycle Time
• Optimum Pipeline:
• The total time required to execute an instruction without pipeline segments is
T nanoseconds.
• Here, we need to find the optimum number of segments S to allow
clocking and pipelining.
• The ideal delay through a segment is Tseg:
$$ T_{seg} = \frac{T}{S} $$
• Partitioning overhead is associated with each segment.
• This clock overhead time C (ns) includes clock skew and the setup and hold
times of the registers.
• Now, the actual cycle time (figure c) of the pipelined processor is the
ideal cycle time T/S plus the overhead:
$$ \Delta t = \frac{T}{S} + C $$
17. Cycle Time
• Optimum Pipeline:
• In an ideal pipelined processor there are no delays, but in practice
delays can occur, for example due to unexpected branches.
• Suppose such delays (interruptions) occur with frequency b and have
the effect of invalidating the (S − 1) instructions prepared to enter, or
already in, the pipeline (figure d).
• Each instruction then takes on average 1 + (S − 1)b cycles, so the
performance of the processor is:
$$ \text{Performance} = \frac{1}{1 + (S - 1)\,b} \ \text{instructions per cycle} $$
18. Cycle Time
• Optimum Pipeline:
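• The throughput (G) can be calculated as performance divided by cycle time:
$$ G = \frac{1}{1 + (S - 1)\,b} \times \frac{1}{T/S + C} \ \text{instructions per ns} $$
If we find the S for which
$$ \frac{dG}{dS} = 0 $$
we can find Sopt, the optimum number of pipeline segments:
$$ S_{opt} = \sqrt{\frac{(1 - b)\,T}{b\,C}} $$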
19. Cycle Time
• Optimum Pipeline:
• The total instruction execution latency (Tinstr) is S cycles of length T/S + C:
$$ T_{instr} = S\left(\frac{T}{S} + C\right) = T + S\,C $$
We can compute the throughput performance G in MIPS.
Suppose T = 12.0 ns, b = 0.2, and C = 0.5 ns.
Then Sopt = √((1 − b)T / (bC)) = √96 ≈ 9.8, i.e., about 10 stages.
Determining Sopt can serve as:
A design starting point or
As an important check on an optimized design.
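The worked example can be checked numerically. The following is a minimal Python sketch, assuming the model described on these slides: cycle time T/S + C, performance 1/(1 + (S − 1)b) instructions per cycle, and Sopt = √((1 − b)T/(bC)). The function names are illustrative and do not come from the slides.

```python
import math

def optimum_segments(T, b, C):
    """Segment count S that maximizes G = 1 / ((1 + (S - 1) * b) * (T / S + C))."""
    return math.sqrt((1.0 - b) * T / (b * C))

def throughput(T, b, C, S):
    """Throughput G in instructions per ns for S pipeline segments."""
    cycle_time = T / S + C                        # ideal segment delay plus clock overhead
    instr_per_cycle = 1.0 / (1.0 + (S - 1) * b)   # each disruption costs S - 1 cycles
    return instr_per_cycle / cycle_time

T, b, C = 12.0, 0.2, 0.5          # values from the slide example
s_opt = optimum_segments(T, b, C)
S = round(s_opt)
g_mips = throughput(T, b, C, S) * 1000.0   # 1 instruction/ns = 1000 MIPS

print(f"S_opt = {s_opt:.2f}, rounded to {S} stages")
print(f"cycle time = {T / S + C:.2f} ns, G = {g_mips:.0f} MIPS")
```

With these inputs the sketch reports about 10 stages, a 1.70 ns cycle, and roughly 210 MIPS. Evaluating `throughput` at S = 9 and S = 11 gives slightly lower values, which is a quick sanity check that S = 10 is the best integer choice.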
20. Die Area and Cost
• Cycle time, machine organization, and memory configuration determine
machine performance.
• Determining performance is relatively straightforward when compared to
the determination of overall cost.
• A good design achieves an optimum cost-performance trade-off at a
particular target performance. This determines the quality of a processor
design.
21. Die Area and Cost
• Processor Area:
• SOCs usually have die sizes of about 10 to 15 mm on a side.
• This die is produced in bulk from a larger wafer, 30 cm in diameter.
• Unfortunately, neither the silicon wafers nor the processing technologies
are perfect.
• Defects occur randomly over the wafer surface.
22. Die Area and Cost
• Processor Area:
• Large chip areas require an absence of defects over that area.
• If chips are too large for a particular processing technology, there will
be little or no yield (good chips produced in a manufacturing process).
• The figure illustrates yield versus chip area.
23. Die Area and Cost
• Processor Area:
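• Die yield as a function of defect density ρD, die area A, and the
parameter α:
$$ \text{Yield} = \left(1 + \frac{\rho_D A}{\alpha}\right)^{-\alpha} $$
• Example:
Find the die yield for dies that are 1.5 cm on a side and 1.0 cm on a
side, assuming a defect density of 0.4 per cm² and α = 4.
• Answer:
The total die areas are 2.25 cm² and 1.00 cm². For the larger die, the
yield is
$$ \left(1 + \frac{0.4 \times 2.25}{4}\right)^{-4} \approx 0.44 $$
and for the smaller die it is
$$ \left(1 + \frac{0.4 \times 1.00}{4}\right)^{-4} \approx 0.68 $$
That is, less than half of all the large dies are good, but more than
two-thirds of the small dies are good.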
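The example above can be reproduced in a few lines. This is a minimal sketch, assuming the yield model Yield = (1 + ρD·A/α)^(−α), which is consistent with the stated answer (less than half for the large die, more than two-thirds for the small one). The function name is illustrative.

```python
def die_yield(defect_density, die_area_cm2, alpha):
    """Assumed model: yield = (1 + defect_density * area / alpha) ** -alpha."""
    return (1.0 + defect_density * die_area_cm2 / alpha) ** (-alpha)

# Values from the slide example: 0.4 defects per cm^2, alpha = 4.
large = die_yield(0.4, 1.5 * 1.5, 4)   # 1.5 cm on a side -> 2.25 cm^2
small = die_yield(0.4, 1.0 * 1.0, 4)   # 1.0 cm on a side -> 1.00 cm^2

print(f"1.5 cm die: {large:.2f}")
print(f"1.0 cm die: {small:.2f}")
```

The two results come out at about 0.44 and 0.68.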
24. Die Area and Cost
• Processor Area:
Figure: Number of die (of area A) on a wafer of diameter d.
25. Die Area and Cost
• Processor Area:
• Suppose a die with square aspect ratio has area A. About N of these
dice can be realized in a wafer of diameter d:
$$ N \approx \frac{\pi}{4A}\left(d - \sqrt{A}\right)^2 $$
• Now suppose there are NG good chips and ND point defects on the
wafer.
• Even if ND > N, we can expect several good chips, since the defects are
randomly distributed and several defects would cluster on defective
chips, sparing a few good ones.
26. Die Area and Cost
• Processor Area:
• Suppose we add a random defect to a wafer; (NG / N) is the probability
that the defect destroys a good die.
• If the defect hits a bad die, it causes no change in the number of
good dies.
• In other words, the change in the number of good dies (NG), with respect
to the change in the number of defects (ND), is
$$ \frac{dN_G}{dN_D} = -\frac{N_G}{N}, \qquad \text{i.e.} \qquad \frac{dN_G}{N_G} = -\frac{dN_D}{N} $$
Integrating and solving:
$$ \ln N_G = -\frac{N_D}{N} + C $$
27. Die Area and Cost
• Processor Area:
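• To evaluate C, note that when NG = N, then ND = 0; so C must be ln(N).
• Then the yield is
$$ \text{Yield} = \frac{N_G}{N} = e^{-N_D / N} $$
This describes a Poisson distribution of defects. If ρD is the defect
density per unit area, then
$$ N_D = \rho_D \times (\text{wafer area}) = \rho_D \, \frac{\pi d^2}{4} $$
For large wafers, d ≫ √A (the diameter of the wafer is significantly larger
than the die side), and
$$ N \approx \frac{\pi d^2}{4A} $$
and
$$ \frac{N_D}{N} \approx \rho_D A $$
so that
$$ \text{Yield} = e^{-\rho_D A} $$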
28. Die Area and Cost
• Processor Area:
• The figure shows the projected number of good dies as a function of die
area for several defect densities.
• A modern fab facility would have ρD between 0.15 and 0.5.
• Doubling the die area has a significant effect on yield.
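The effect of die area on yield can be explored with a short script. This is a minimal sketch, assuming the Poisson yield model Yield = e^(−ρD·A) and the large-wafer approximation N ≈ πd²/(4A). The 30 cm wafer diameter is the value mentioned earlier in the deck, and the die areas are arbitrary illustrative choices.

```python
import math

def poisson_yield(defect_density, die_area_cm2):
    """Assumed model: yield = exp(-defect_density * die_area)."""
    return math.exp(-defect_density * die_area_cm2)

def good_dies(wafer_diameter_cm, die_area_cm2, defect_density):
    """Expected good dies, using the large-wafer approximation N = pi d^2 / (4 A)."""
    n_total = math.pi * wafer_diameter_cm ** 2 / (4.0 * die_area_cm2)
    return n_total * poisson_yield(defect_density, die_area_cm2)

for rho in (0.15, 0.3, 0.5):              # defect densities per cm^2
    for area in (1.0, 2.0):               # illustrative die areas in cm^2
        y = poisson_yield(rho, area)
        n = good_dies(30.0, area, rho)
        print(f"rho={rho:.2f} A={area:.1f} cm^2: yield={y:.3f}, good dies={n:.0f}")
```

Under this model, doubling the die area squares the yield (e^(−2ρA) = (e^(−ρA))²) and also halves the number of die sites, so the count of good dies falls by more than half.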
29. Ideal and Practical Scaling
• As feature sizes shrink and transistors get smaller, transistor
density improves.
• Similarly, transistor delay (or gate delay) should decrease linearly
with feature size.
• Practical scaling is different, because wire delay and wire density do not
scale at the same rate as transistors.
• Wire delay remains almost constant as feature sizes shrink.
30. Ideal and Practical Scaling
• The figure illustrates the increasing dominance of wire delay over gate
delay.
Figure: The dominance of wire delay over gate delay.
31. Ideal and Practical Scaling
• A scaling factor of 1.5 is commonly considered more accurate.
• Major technology changes can affect scaling in a discontinuous
manner.
• The simple scaling of a design might only scale as 1.5, but a new
implementation taking advantage of all technology features could
scale at 2.
32. Ideal and Practical Scaling
• Baseline SOC Area Model:
• A key factor in designing an efficient system is chip floorplanning.
• Each functional area of the processor must be allocated sufficient
space for its implementation.
• Functional units that frequently communicate must be placed close
together. Sufficient room must be allocated for connection paths.
• A baseline system can be used to illustrate possible trade-offs in
optimizing the chip floorplan.
• This model is based upon observations made of existing chips and
design experience.
33. Ideal and Practical Scaling
• Baseline SOC Area Model:
• Starting Point: The design process begins with an understanding of the
parameters of the semiconductor process.
• Suppose we expect to be able to use a manufacturing process that has a
defect density of 0.2 defects per square centimeter; for economic
reasons, we target an initial yield of about 95%:
$$ Y = e^{-\rho_D A} $$
where ρD = 0.2 defects per square centimeter and Y = 0.95. Then
A = −ln(0.95) / 0.2, which is approximately 0.25 cm².
34. Ideal and Practical Scaling
• Baseline SOC Area Model:
• So the chip area available to us is 25 mm².
• This is the total die area of the chip, but such things as pads for the
wire bonds that connect the chip to the external world, drivers for these
connections, and power supply lines all act to decrease the amount of
chip area available to the designer.
• Suppose we allow 12% of the chip area to accommodate these functions
(usually around the periphery of the chip); then the net area will be
22 mm².
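The starting-point numbers can be reproduced with a short calculation. This is a minimal sketch, assuming the Poisson yield model Y = e^(−ρD·A) is solved for A; the function name is illustrative.

```python
import math

def target_die_area_cm2(target_yield, defect_density):
    """Solve Y = exp(-rho * A) for the die area A (in cm^2)."""
    return -math.log(target_yield) / defect_density

area_cm2 = target_die_area_cm2(0.95, 0.2)   # slide values: Y = 0.95, rho = 0.2 per cm^2
area_mm2 = area_cm2 * 100.0                 # 1 cm^2 = 100 mm^2

gross_mm2 = 25.0                            # the slides round the result to 25 mm^2
net_mm2 = gross_mm2 * (1.0 - 0.12)          # 12% reserved for pads, drivers, power lines

print(f"computed die area: {area_mm2:.1f} mm^2")
print(f"net area after 12% overhead: {net_mm2:.0f} mm^2")
```

The computed area is slightly above 25 mm²; the slides round it down before applying the 12% overhead, which gives the 22 mm² net figure.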
35. Ideal and Practical Scaling
• Baseline SOC Area Model:
• Feature Size: The smaller the feature size, the more logic can be
accommodated within a fixed area.
• At feature size f = 65 nm, we have about 5200 A (area units) in 22 mm².
• The Architecture: Each system has different objectives.
• For example, assume that we need the following:
– A small 32-bit core processor with an 8 KB I-cache and a 16 KB D-cache;
– Two 32-bit vector processors;
– Memory: an 8 KB I-cache and a 16 KB D-cache for scalar data;
– A bus control unit;
– Directly addressed application memory of 128 KB; and
– A shared L2 cache.
36. Ideal and Practical Scaling
• Baseline SOC Area Model:
• An Area Model: The following is a breakdown of the area required for
various units used in the system.
• Latches, Buses, and Interunit Control: For each of the functional
units, there is a certain amount of overhead to accommodate
nonspecific storage (latches), interunit communications (buses), and
interunit control.
• This is allocated as 10% overhead for latches and 40% overhead for
buses, routing, clocking, and overall control.
37. Ideal and Practical Scaling
• Baseline SOC Area Model:
• Total System Area: The designated processor elements and storage
occupy 2462 A. This leaves a net of 5200 − 2462 = 2738 A available
for cache.
• Cache Area: The net area available for cache is 2738 A.
• However, bits and pieces that may be unoccupied on the chip are not
always useful to the cache designer.
• These pieces must be collected into a reasonably compact area that
accommodates efficient cache designs.
38. Ideal and Practical Scaling
• Baseline SOC Area Model:
• An example baseline floor plan is shown in the figure.
• A summary of area design rules follows:
1. Compute the target chip size from the target yield and defect density.
2. Compute the die cost and determine whether it is satisfactory.
3. Compute the net available area. Allow 10 to 20% for pins, guard ring,
power supplies, and so on.
4. Determine the rbe (register bit equivalent) size from the minimum
feature size.
5. Allocate the area based on a trial system architecture until the basic
system size is determined.
6. Subtract the basic system size (5) from the net available area (3). This
is the die area available for cache and storage.
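Rules 3 to 6 can be sketched as a small area budget. This is a hedged illustration, not the deck's method: it assumes that one area unit A equals f² × 10⁶ with f in micrometres (the slides do not define A; this definition is chosen because it reproduces the "about 5200 A in 22 mm²" figure at f = 65 nm). The 2462 A system size is taken from the slides, and the names are illustrative.

```python
def area_units(net_area_mm2, feature_um):
    """Number of area units A that fit into the net die area.

    ASSUMPTION (not stated in the slides): A = f^2 * 1e6, with f in
    micrometres, so A is expressed in square micrometres.
    """
    unit_um2 = feature_um ** 2 * 1e6      # one area unit in um^2
    unit_mm2 = unit_um2 / 1e6             # 1 mm^2 = 1e6 um^2
    return net_area_mm2 / unit_mm2

NET_AREA_MM2 = 22.0        # net area from the slides (25 mm^2 minus 12%)
FEATURE_UM = 0.065         # 65 nm feature size
SYSTEM_AREA_UNITS = 2462   # processors and storage, from the slides

total_units = area_units(NET_AREA_MM2, FEATURE_UM)
budget_units = 5200        # the slides round the total down to 5200 A
cache_units = budget_units - SYSTEM_AREA_UNITS

print(f"area units available: {total_units:.0f} (slides use {budget_units})")
print(f"area left for cache: {cache_units} A")
```

Rule 4 (the rbe size) is not modeled here, because the slides give no conversion between area units and register bit equivalents.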
39. Power
• Growing demands for wireless and portable electronic appliances
have focused much attention on power consumption.
• The SIA road map points to increasingly higher power for
microprocessor chips because of their higher operating frequency,
higher overall capacitance, and larger size.
• Power scales indirectly with feature size (45 nm, 32 nm, 22 nm, etc.).
40. Power
• At the device level, total power dissipation (Ptotal) has two major
sources:
– dynamic or switching power and
– static power caused by leakage current:
$$ P_{total} = \frac{C\,V^2\,freq}{2} + I_{leakage}\,V $$
where C is the device capacitance;
V is the supply voltage;
freq is the device switching frequency; and
Ileakage is the leakage current.
Gate delays are roughly proportional to CV / (V − Vth)², where Vth is the
threshold voltage of the transistors.
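The interaction between supply voltage, power, and gate delay can be seen with a few numbers. This is a minimal sketch, assuming one common form of the power model, Ptotal = C·V²·freq/2 + Ileakage·V, and the delay proportionality CV/(V − Vth)² stated above. All component values are hypothetical and chosen only for illustration.

```python
def power_components(c_farads, v_volts, freq_hz, i_leak_amps):
    """Return (dynamic, static) power in watts under the assumed model."""
    dynamic = c_farads * v_volts ** 2 * freq_hz / 2.0
    static = i_leak_amps * v_volts
    return dynamic, static

def relative_gate_delay(c_farads, v_volts, vth_volts):
    """Quantity proportional to gate delay: C * V / (V - Vth)^2."""
    return c_farads * v_volts / (v_volts - vth_volts) ** 2

# Hypothetical values: 1 nF switched capacitance, 1 GHz, 0.1 A leakage, Vth = 0.3 V.
C, FREQ, I_LEAK, VTH = 1e-9, 1e9, 0.1, 0.3

for v in (1.0, 0.8):
    dyn, stat = power_components(C, v, FREQ, I_LEAK)
    delay = relative_gate_delay(C, v, VTH) / relative_gate_delay(C, 1.0, VTH)
    print(f"V={v:.1f} V: dynamic={dyn:.2f} W, static={stat:.2f} W, "
          f"delay x{delay:.2f} vs. 1.0 V")
```

Lowering V from 1.0 V to 0.8 V cuts the dynamic term quadratically, while the delay term grows because (V − Vth)² shrinks faster than V.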
41. Power
• As feature sizes decrease, so do device sizes.
• Smaller device sizes result in reduced capacitance.
• Decreasing the capacitance decreases both the dynamic power
consumption and the gate delays.
• As device sizes decrease, the electric field applied to them becomes
destructively large.
• To increase the device reliability, we need to reduce the supply
voltage V.
42. Power
• Reducing V effectively reduces the dynamic power consumption but
results in an increase in the gate delays.
• We can avoid this loss by reducing Vth.
• Reducing Vth increases the leakage current and hence, static power
consumption also increases.
• This has an important effect on design and production; there are two
device designs that must be accommodated in production:
1. The high-speed device with low Vth and high static power; and
2. The slower device maintaining Vth and low static power with increase
of circuit density.
43. Reliability
• An important design dimension is reliability (also called dependability
or fault tolerance).
• Reliability is related to
– die area,
– clock frequency, and
– power.
• Die area increases the amount of circuitry and the probability of a fault.
• Higher clock frequencies increase electrical noise and noise sensitivity.
44. Reliability
• Faults, if detected, can be masked by
– error-correcting codes (ECCs),
– instruction retry, or
– functional reconfiguration.
• Some definitions:
1. A failure is a deviation from a design specification.
2. An error is a failure that results in an incorrect signal value.
3. A fault is an error that manifests itself as an incorrect logical result.
4. A physical fault is a failure caused by the environment, such as aging,
radiation, temperature, or temperature cycling. The probability of
physical faults increases with time.
5. A design fault is a failure caused by a design implementation that is
inconsistent with the design specification.
45. Reliability
• Dealing with Manufacturing Faults:
• The traditional way of dealing with manufacturing faults is through
testing.
• As transistor density increases, the problem of testing increases even
faster.
• The testable combinations increase exponentially with transistor count.
46. Reliability
• Dealing with Manufacturing Faults:
• A technique to give testing access to interior (not accessible from the
instruction set) storage cells is called scan .
• A scan chain in its simplest form consists of a separate entry and exit
point from each storage cell.
• Scan allows predetermined data configurations to be entered into
storage, and the output of particular configurations can be compared
with known correct output configurations.
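The scan idea can be illustrated with a toy model. This is a hedged sketch rather than a description of real scan hardware: it assumes the storage cells are linked into a single serial shift register (one common arrangement), and the class, the inverter "logic", and the stuck-at fault are all hypothetical.

```python
class ScanChain:
    """Toy scan chain: storage cells linked into one serial shift register."""

    def __init__(self, length):
        self.cells = [0] * length

    def shift(self, bit_in):
        """Shift one bit in at the entry point; return the bit leaving the exit."""
        bit_out = self.cells[-1]
        self.cells = [bit_in] + self.cells[:-1]
        return bit_out

    def scan_in(self, pattern):
        """Load a full-length test pattern so that cells == pattern afterwards."""
        for bit in reversed(pattern):
            self.shift(bit)

    def capture(self, logic):
        """One functional clock: cells store the logic's response to their state."""
        self.cells = list(logic(self.cells))

    def scan_out(self):
        """Shift the whole state out and return it in cell order."""
        return [self.shift(0) for _ in range(len(self.cells))][::-1]

def good_logic(state):
    return [1 - b for b in state]          # hypothetical logic: invert every bit

def faulty_logic(state):
    out = good_logic(state)
    out[1] = 0                             # hypothetical stuck-at-0 fault on cell 1
    return out

pattern = [1, 0, 1, 1]
expected = good_logic(pattern)             # known correct output configuration

for name, logic in (("good", good_logic), ("faulty", faulty_logic)):
    chain = ScanChain(len(pattern))
    chain.scan_in(pattern)
    chain.capture(logic)
    observed = chain.scan_out()
    print(name, "PASS" if observed == expected else "FAIL", observed)
```

The pattern is chosen so that the faulty cell should hold a 1 after capture, which is why the comparison against the known correct output exposes the fault.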
48. Configurability
Figure: application kernels (FFT butterfly, 2-stage filter) and a typical FPGA with labeled elements: look-up tables (LUTs), flip-flops, coarse-grain units (adders, multipliers, etc.), and multiplexers and switches.
49. Configurability
• Reconfigurable design is used to:
• Reduce time (execution time)
• Reduce area (reuse the same area)
• Increase reliability (quality should not degrade over time)