Embedded Computing Systems
Unit – III
Text Book Used:
Wayne Wolf: Computers as Components,
Principles of Embedded Computing Systems
Design, 2nd Edition, Elsevier, 2008.
By
Dr. K. Satyanarayan Reddy
CiTECH, B’lore-36.
Bus-Based Computer Systems
THE CPU BUS: A computer system
comprises the CPU together with
memory and I/O devices.
The bus is the mechanism by
which the CPU communicates
with Memory and Devices.
A Bus is, at a minimum, a
collection of wires, but the bus
also defines a protocol by which
the CPU, memory, and devices
communicate.
One of the major roles of the bus
is to provide an interface to
memory.
Bus Protocols: The basic building
block of most bus protocols is
the Four-cycle Handshake, as
shown in adjacent Figure :
October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 2
The Four-cycle Handshake
1. Device 1 raises its output to signal an enquiry, which tells
device 2 that it should get ready to listen for data.
2. When device 2 is ready to receive, it raises its output to signal
an acknowledgment.
At this point, devices 1 and 2 can transmit or receive.
3. Once the data transfer is complete, device 2 lowers its output,
signaling that it has received the data.
4. After seeing that ack has been released, device 1 lowers its
output.
At the end of the handshake, both handshaking signals are low,
just as they were at the start of the handshake.
The system has thus returned to its original state in readiness
for another handshake-enabled data transfer.
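The four steps above can be traced with a small simulation. This is only a sketch: the names `enq` and `ack` stand for the outputs of device 1 and device 2, and the function name and the way data is passed are assumptions made for illustration.

```python
def four_cycle_handshake(data):
    """Simulate one four-cycle handshake; return (received, trace)."""
    enq = ack = 0                  # both handshake lines start low
    trace = [(enq, ack)]

    enq = 1                        # 1. device 1 raises enq (enquiry)
    trace.append((enq, ack))

    ack = 1                        # 2. device 2 raises ack when ready
    trace.append((enq, ack))
    received = data                # data moves while both lines are high

    ack = 0                        # 3. device 2 lowers ack: data received
    trace.append((enq, ack))

    enq = 0                        # 4. device 1 sees ack released, lowers enq
    trace.append((enq, ack))
    return received, trace
```

The first and last entries of the trace are both (0, 0), which is the return to the original state described above.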
The term bus is used in two ways: it may mean a set of related wires, such as the data or address wires,
or it may mean a protocol for communicating between components.
To avoid confusion, the term bundle will be used to refer to a set of related signals.
The fundamental bus operations are READING and WRITING.
Figure below shows the structure of a typical bus that supports reads and writes.
The major components follow:
■ Clock provides synchronization to the bus components,
■ R/W is true when the bus is reading and false when the bus is writing,
■ Address is an a-bit bundle of signals that transmits the address for an access,
■ Data is an n-bit bundle of signals that can carry data to or from the CPU, and
■ Data ready signals when the values on the data bundle are valid.
A typical Microprocessor Bus
All transfers on this basic bus are controlled by the CPU, which can read or
write a device or memory, but devices or memory cannot initiate a transfer
on their own.
This is reflected by the fact that R/W and Address are unidirectional signals,
since only the CPU can determine the address and direction of the transfer.
The behavior of a bus is specified with a Timing Diagram, which shows how the
signals on a bus change over time, but since values like the address and data
can take on many values, some standard notation is used to describe signals,
as shown in Figure below:
A’s value is known at all times, so it is shown as a standard waveform that changes
between 0 and 1. B and C alternate between changing and stable states.
A stable signal has, as the name implies, a stable value that could be measured by an
oscilloscope.
e.g.: An address bus may be shown as stable when the address is present, but the
bus’s timing requirements are independent of the exact address on the bus.
A signal can go between a known 0/1 state and a stable/changing state.
A changing signal does not have a stable value. Changing signals should not be used
for computation.
To be sure that signals go to their proper values at the proper times, timing diagrams
sometimes show Timing Constraints.
Timing constraints are drawn in two different ways, depending on whether the concern is
the amount of time between events or only the order of events.
e.g.: The timing constraint from A to B shows that A must go high before B becomes
stable.
The constraint from A to B also has a time value of 10 ns, indicating that A goes
high at least 10 ns before B goes stable.
The adjacent figure shows a timing
diagram for the example bus.
The diagram shows a Read and a Write.
Timing Constraints are shown only for the
Read operation, but similar
constraints apply to the write
operation.
The bus is normally in the read mode
since that does not change the state
of any of the devices or memories.
Note: The direction of data transfer on
bidirectional lines is not specified in
the timing diagram.
During a read, the external device or
memory is sending a value on the
data lines, while during a write the
CPU is controlling the data lines.
Timing Diagram for the Bus
The sequence of operations for a READ on the Timing Diagram is as
follows:
■ A read or write is initiated by setting address enable high after the
clock starts to rise.
For a read, R/W is set to 1 and the address lines are set to
the desired address.
■ After 1 clock cycle, the memory or device is expected to assert the
data value at that address on the data lines.
Simultaneously, the external device specifies that the data are valid
by pulling down the data ready line.
This line is active low, meaning that a logically true value is indicated
by a low voltage, in order to provide increased immunity to
electrical noise.
■ The CPU is free to remove the address at the end of the clock cycle
and must do so before the beginning of the next cycle.
The Handshake that tells the CPU and
Devices when data are to be
transferred is formed by data ready
for the acknowledge side, but is
implicit for the enquiry side.
Since the bus is normally in read mode,
“enq” does not need to be asserted,
but the “acknowledge” must be
provided by Data Ready.
The Data Ready signal allows the bus to
be connected to devices that are
slower than the bus.
As shown in adjacent Figure, the external
device need not immediately assert
data ready.
The cycles between the minimum time at
which data can be asserted and when
it is actually asserted are known as
Wait States. Wait states are
commonly used to connect slow,
inexpensive memories to buses.
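Wait states can be counted with a short simulation. This is a sketch under stated assumptions: the function name, the `device_latency` parameter, and the dictionary standing in for memory are all hypothetical; only the active-low data ready line and the one-cycle minimum come from the notes.

```python
def bus_read(memory, address, device_latency):
    """Return (data, wait_states) for a read from a device that needs
    device_latency clock cycles to produce its data."""
    MIN_CYCLES = 1                  # earliest cycle at which data may be asserted
    cycle = 0
    data_ready_n = 1                # active low: 1 means "not ready"
    data = None
    while data_ready_n == 1:        # CPU keeps waiting while data ready is high
        cycle += 1
        if cycle >= device_latency:
            data = memory[address]  # device drives the data lines
            data_ready_n = 0        # and pulls data ready low
    return data, cycle - MIN_CYCLES
```

A device that answers in one cycle gives zero wait states; a device that needs three cycles gives two.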
A wait state on a read operation
The bus handshaking signals can also be
used to perform Burst Transfers, as
illustrated in Figure on right.
In this Burst Read Transaction, the CPU
sends one address but receives a
sequence of data values.
Those values come from successive
memory locations starting at the
given address.
Here an extra line is added to the bus,
called burst', which signals when a
transaction is actually a burst.
Releasing the burst' signal tells the
device that enough data has been
transmitted.
To stop receiving data after the end of
data 4, the CPU releases the burst'
signal at the end of data 3, since the
device requires some time to
recognize the end of the burst.
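The early release of the burst signal can be sketched as follows. The function name is hypothetical, the list stands in for memory, and treating the burst line as active low (0 while the burst is in progress) is an assumption of this sketch.

```python
def burst_read(memory, start_address, count):
    """Return count values from successive locations starting at start_address."""
    if count < 2:
        raise ValueError("a burst needs at least two transfers")
    values = []
    burst_n = 0                         # assumed active low: 0 = burst in progress
    address = start_address
    while True:
        values.append(memory[address])  # device sends the next data value
        address += 1
        if burst_n == 1:
            break                       # device has now recognized the release
        if len(values) == count - 1:
            burst_n = 1                 # CPU releases burst one transfer early
    return values
```

With count = 4 the release happens after the third value, and the device still delivers the fourth before stopping.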
A burst read transaction
Some buses provide Disconnected Transfers.
In these buses, the request and response are
separate.
A first operation requests the transfer.
The bus can then be used for other
operations.
The transfer is completed later, when the data
are ready.
The state machine view of the bus transaction
is also helpful and a useful complement to
the timing diagram.
Figure on right shows the CPU and device
state machines for the read operation.
As with a timing diagram, not all the possible
values of address and data lines are
shown, instead transitions of control
signals are dealt with.
When the CPU decides to perform a read
transaction, it moves to a new state,
sending bus signals that cause the device
to behave appropriately.
The device’s state transition graph captures its
side of the protocol.
State diagrams for the bus read transaction
Some buses have Data Bundles that are smaller
than the word size of the CPU; using
fewer data lines reduces the cost of the
chip.
In that case byte addresses are sent sequentially
over the bus and one byte is received at a time;
the bytes are assembled inside the CPU’s bus logic
before being presented to the CPU proper.
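The byte assembly done by the bus logic might look like the sketch below. The function name, the `read_byte` callback, the 4-byte word size, and the little-endian byte order are all assumptions chosen for illustration.

```python
def read_word_over_byte_bus(read_byte, address, word_bytes=4):
    """Assemble one CPU word from successive byte-wide bus reads."""
    word = 0
    for i in range(word_bytes):
        byte = read_byte(address + i) & 0xFF   # one byte-wide bus transaction
        word |= byte << (8 * i)                # assumed little-endian placement
    return word
```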
Some buses use multiplexed address and data.
As shown in Figure on right, additional control
lines are provided to tell whether the value
on the address/data lines is an address or
data.
Typically, the address comes first on the
combined address/data lines, followed by
the data.
The address can be held in a register until the
data arrive so that both can be presented
to the device (such as a RAM) at the same
time.
Bus signals for multiplexing address and data
Direct Memory Access (DMA)
Standard bus transactions require the CPU to be in the middle of
every read and write transaction.
However, there are certain types of data transfers in which the CPU
does not need to be involved.
e.g.: A high-speed I/O device may wish to transfer a block of data
into memory.
This capability requires that some unit other than the CPU be
able to control operations on the bus.
Direct memory access (DMA) is a bus operation that allows reads
and writes not controlled by the CPU.
A DMA transfer is controlled by a DMA controller, which requests
control of the bus from the CPU.
After gaining control, the DMA controller performs read and write
operations directly between devices and memory.
Figure below shows the configuration of a bus with a DMA controller.
The DMA requires the CPU to provide two additional bus signals:
■ The bus request is an input to the CPU through which DMA
controllers ask for ownership of the bus.
■ The bus grant signals that the bus has been granted to the
DMA controller.
A device that can initiate its own bus transfer is known as a Bus Master.
The DMA controller uses bus request & bus grant signals to gain control of
the bus using a classic four-cycle handshake.
The bus request is asserted by the DMA controller when it wants to control
the bus, and the bus grant is asserted by the CPU when the bus is ready.
The CPU will finish all pending bus transactions before granting control of the
bus to the DMA controller.
When it does grant control, it stops driving the other bus signals: R/W,
address, and so on.
Upon becoming Bus Master, the DMA controller has control of all bus signals
and it can perform reads and writes using the same bus protocol as with
any CPU-driven bus transaction.
Memory and devices do not know whether a read or write is performed by
the CPU or by a DMA controller.
After the transaction is finished, the DMA controller returns the bus to the
CPU by de-asserting the bus request, causing the CPU to de-assert the bus
grant.
The CPU controls the DMA operation through registers
in the DMA controller.
A typical DMA controller includes the following three
registers:
■ A starting address register specifies where the
transfer is to begin.
■ A length register specifies the number of words
to be transferred.
■ A status register allows the DMA controller to be
operated by the CPU.
The CPU initiates a DMA transfer by setting the starting
address and length registers appropriately and
then writing the status register to set its start
transfer bit.
After the DMA operation is complete, the DMA
controller interrupts the CPU to tell it that the
transfer is done.
The CPU’s role during a DMA transfer: the CPU
cannot use the bus while the DMA controller is bus master.
As shown in adjacent Figure 4.10, if the CPU has
enough instructions and data in the cache and
registers, it may be able to continue doing useful
work for quite some time oblivious of the DMA
transfer.
But once the CPU needs the bus, it stalls until the DMA
controller returns bus mastership to the CPU.
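The register-level view of a DMA transfer can be sketched as below. The class, the bit assignments in the status register, and the `device_data` list are hypothetical; the bus request and bus grant handshake is left out to keep the sketch short.

```python
class DMAController:
    START_BIT = 0x1    # hypothetical "start transfer" bit
    DONE_BIT = 0x2     # hypothetical "transfer done" bit

    def __init__(self, memory, device_data):
        self.memory = memory              # list standing in for main memory
        self.device_data = device_data    # words supplied by the I/O device
        self.start_address = 0            # starting address register
        self.length = 0                   # length register (in words)
        self.status = 0                   # status register
        self.interrupt_pending = False

    def write_status(self, value):
        """CPU writes the status register; the start bit launches a transfer."""
        self.status = value
        if value & self.START_BIT:
            self._run_transfer()

    def _run_transfer(self):
        # Device-to-memory writes performed while the controller is bus master.
        for i in range(self.length):
            self.memory[self.start_address + i] = self.device_data[i]
        self.status = self.DONE_BIT
        self.interrupt_pending = True     # interrupt tells the CPU it is done
```

The CPU side then reduces to three register writes: starting address, length, and the start bit in the status register.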
UML sequence diagram of system
activity around a DMA transfer
System Bus Configurations
A microprocessor system generally has more than one bus.
As shown in Figure below, high-speed devices may be connected to a high-performance bus, while lower-
speed devices are connected to a different bus.
A small block of logic known as a Bridge allows the buses to connect to each other.
The advantages of using multiple buses and bridges are:
■ Higher-speed buses may provide wider data connections.
■ A high-speed bus usually requires more expensive circuits and connectors. The cost of low-speed
devices can be held down by using a lower-speed, lower-cost bus.
■ The bridge may allow the buses to operate independently, thereby providing some parallelism in
I/O operations.
A multiple bus system
Operation of a bus Bridge: The bridge is a slave on the fast bus and the master of the slow bus.
The bridge takes commands from the fast bus (on which it is a slave) and issues those commands on the
slow bus (of which it is a master).
It also returns the results from the slow bus to the fast bus; e.g.: It returns the results of a read on the
slow bus to the fast bus.
The upper sequence of states handles a write from the fast bus to the slow bus.
These states must read the data from the fast bus and set up the handshake for the slow bus.
Operations on the fast and slow sides of the bus bridge should be overlapped as much as possible to
reduce the latency of bus-to-bus transfers.
Similarly, the bottom sequence of states reads from the slow bus and writes the data to the fast bus.
UML state diagram of
bus bridge operation
AMBA Bus
The AMBA bus supports CPUs, memories, and peripherals integrated in a system-on-silicon.
As shown in Figure below, the AMBA specification includes two buses. The AMBA High-performance
Bus (AHB) is optimized for high-speed transfers and is directly connected to
the CPU. It supports several high-performance features: pipelining, burst transfers, split
transactions, and multiple bus masters.
A bridge can be used to connect the AHB to an AMBA Peripherals Bus (APB).
The APB is designed to be simple and easy to implement; it also consumes relatively little
power.
The APB assumes that all peripherals act as slaves, simplifying the logic required in both the
peripherals and the bus controller. It also does not perform pipelined operations, which
simplifies the bus logic.
Elements of the ARM AMBA bus system
Memory Device Organization: A memory is
characterized by its capacity, such as 256
Mb.
e.g: A 256-Mb memory may be available in
two versions with different data widths:
■ As a 64M × 4-bit array, a single memory
access obtains a 4-bit data item, with
a maximum of 2^26 different addresses.
■ As a 32M × 8-bit array, a single memory
access obtains an 8-bit data item, with
a maximum of 2^25 different addresses.
The height/width ratio of a memory is known
as its Aspect Ratio.
The best aspect ratio depends on the amount
of memory required.
Internally, the data are stored in a two-
dimensional array of memory cells as
shown in adjacent Figure.
Because the array is stored in two dimensions,
the n-bit address received by the chip is
split into a row and a column address
(with n = r + c).
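The address arithmetic can be checked with a short sketch. The function names are hypothetical, and placing the row in the upper address bits and the column in the lower bits is an assumption; the notes only state that n = r + c.

```python
def address_bits(depth_words):
    """Number of address bits needed to select one of depth_words locations."""
    return (depth_words - 1).bit_length()

def split_address(address, col_bits):
    """Split an n-bit address into (row, column); the row is assumed to
    occupy the upper bits and the column the lower col_bits bits."""
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col
```

A 64M-deep array needs 26 address bits and a 32M-deep array needs 25, so both a 64M × 4-bit and a 32M × 8-bit organization hold 256 Mbit.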
Internal organization of a memory device
MEMORY DEVICES
The row and column select a particular memory cell.
If the memory’s external width is 1 bit, the column
address selects a single bit; for wider data widths, the
column address can be used to select a subset of the
columns.
Most memories include an enable signal that controls
the tri-stating of data onto the memory’s pins.
A read/write signal (R/W in the figure) on read/write
memories controls the direction of data transfer;
memory chips do not typically have separate read and
write data pins.
Random-Access Memories
Random Access memories can be both read and written. They are
called random access because addresses can be read in any order.
Most bulk memory in modern systems is dynamic RAM (DRAM).
DRAM is very dense; it does, however, require that its values be
refreshed periodically since the values inside the memory cells
decay over time.
The dominant form of dynamic RAM today is synchronous DRAM
(SDRAM), which uses clocks to improve DRAM performance.
SDRAMs use Row Address Select (RAS) and Column Address Select
(CAS) signals to break the address into two parts, which select the
proper row and column in the RAM array.
Signal transitions are relative to the SDRAM clock, which allows the
internal SDRAM operations to be pipelined.
As shown in adjacent Figure, transitions on
the control signals are related to a clock.
RAS and CAS can therefore become valid at
the same time.
The address lines are not shown in full
detail here; some address lines may not
be active depending on the mode in
use.
SDRAMs use a separate refresh signal to
control refreshing.
DRAM has to be refreshed roughly once per
millisecond. Rather than refreshing the
entire memory at once, DRAMs refresh
part of the memory at a time.
When a section of memory is being
refreshed, it cannot be accessed until
the refresh is complete.
The refresh operations are spread across
the refresh period, so that a section is
refreshed every few microseconds.
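The spacing between refresh operations follows from simple arithmetic. The sketch below assumes the "roughly once per millisecond" refresh period from the notes and a hypothetical memory with 256 sections; real devices specify their own refresh period and row count.

```python
def refresh_interval_us(refresh_period_ms, sections):
    """Microseconds between successive section refreshes when the refreshes
    are spread evenly over one refresh period."""
    return refresh_period_ms * 1000.0 / sections
```

With a 1 ms period and 256 sections, one section is refreshed about every 3.9 microseconds, which matches the "every few microseconds" figure.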
Timing diagram for a read on a synchronous DRAM
Read-only memories (ROMs) are preprogrammed with fixed data.
They are very useful in embedded systems since a great deal of the code, and perhaps some
data, does not change over time.
There are several types of ROM available, chiefly factory-programmed ROM (sometimes called
mask-programmed ROM) and field-programmable ROM.
Factory-programmed ROMs are ordered from the factory with particular programming.
ROMs can typically be ordered in lots of a few thousand, but clearly factory programming is
useful only when the ROMs are to be installed in some quantity.
Field-programmable ROMs, on the other hand, can be programmed in the lab.
Flash memory is the dominant form of field-programmable ROM and is electrically erasable.
Flash memory uses standard system voltage for erasing and programming, allowing it to be
reprogrammed inside a typical system.
Early flash memories had to be erased in their entirety; modern devices allow memory to be
erased in blocks.
Most flash memories today allow certain blocks to be protected. The boot-up code is
kept in a protected block, while other memory blocks on the device can still be updated.
This form of flash is commonly known as Boot-block flash.
Read Only Memories (ROM)
I/O DEVICES
Timers and Counters: Timers and counters are
distinguished largely on the basis of their usage, not
their logic.
Both are built from adder logic with registers to hold the
current value, with an increment input that adds one to
the current register value.
However, a Timer has its count connected to a periodic
clock signal to measure time intervals, while a Counter
has its count input connected to an aperiodic signal in
order to count the number of occurrences of some
external event.
Because the same logic can be used for either purpose, the
device is often called a Counter/Timer.
The adjacent Figure shows enough of the internals
of a Counter/Timer to illustrate its operation.
An n-bit counter/timer uses an n-bit register to
store the current state of the count and an
array of half subtractors to decrement the
count when the count signal is asserted.
Combinational logic checks when the count equals
zero; the done output signals the zero count.
It is often useful to be able to control the time-out,
rather than require exactly 2^n events to occur.
For this purpose, a reset register provides the value
with which the count register is to be loaded.
The Counter/Timer provides logic to load the reset
register.
Most counters provide both cyclic and acyclic
modes of operation.
In the cyclic mode, once the counter reaches the
done state, it is automatically reloaded and the
counting process continues.
In acyclic mode, the counter/timer waits for an
explicit signal from the microprocessor to
resume counting.
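The behavior of the count register, reset register, and done output can be sketched in software. The class and method names are hypothetical, and `reset_value` is assumed to be at least 1.

```python
class CounterTimer:
    def __init__(self, reset_value, cyclic=True):
        self.reset_value = reset_value   # contents of the reset register
        self.cyclic = cyclic             # cyclic or acyclic mode
        self.count = reset_value         # count register
        self.done = False                # done output (count equals zero)

    def tick(self):
        """One assertion of the count input."""
        if self.done:
            if not self.cyclic:
                return                   # acyclic: wait for restart()
            self.count = self.reset_value    # cyclic: automatic reload
            self.done = False
        self.count -= 1                  # decrement, as the subtractors do
        self.done = (self.count == 0)

    def restart(self):
        """Explicit signal from the microprocessor to resume counting."""
        self.count = self.reset_value
        self.done = False
```

In cyclic mode with a reset value of 3, done is raised on every third tick; in acyclic mode the counter stays at zero until restart() is called.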
Internals of a Counter/Timer
A Watchdog Timer is an I/O device that is
used for internal operation of a system.
As shown in Figure, the Watchdog Timer is
connected into the CPU bus and also to
the CPU’s reset line.
The CPU’s software is designed to
periodically reset the watchdog timer,
before the timer ever reaches its time-
out limit.
If the watchdog timer ever does reach that
limit, its time-out action is to reset the
processor.
In that case, the presumption is that either
a Software Flaw or Hardware Problem
has caused the CPU to misbehave.
Rather than diagnosing the problem, the
system is reset to get it operational as
quickly as possible.
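The interaction between the software and the watchdog can be sketched as follows. The class is hypothetical, `kick()` stands for the periodic reset performed by the CPU's software, and the `reset_cpu` callback stands in for the CPU's reset line.

```python
class WatchdogTimer:
    def __init__(self, timeout, reset_cpu):
        self.timeout = timeout       # time-out limit in clock ticks
        self.reset_cpu = reset_cpu   # callback standing in for the reset line
        self.count = 0

    def kick(self):
        """Software resets the watchdog before it reaches the limit."""
        self.count = 0

    def tick(self):
        """One clock tick; on reaching the limit, reset the processor."""
        self.count += 1
        if self.count >= self.timeout:
            self.reset_cpu()
            self.count = 0
```

As long as kick() is called often enough, reset_cpu is never invoked; if the software hangs and stops kicking, the reset fires after `timeout` ticks.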
A Watchdog Timer
A/D and D/A Converters
Analog/Digital (A/D) and Digital/Analog (D/A) converters (typically
known as ADCs and DACs, respectively) are often used to interface
nondigital devices to embedded systems.
Because A/D conversion requires more complex circuitry, it requires a
somewhat more complex interface.
Analog/digital conversion requires sampling the analog input before
converting it to digital form.
A control signal causes the A/D converter to take a sample and digitize
it.
A typical A/D interface has, in addition to its analog inputs, two major
digital inputs.
A Data Port allows A/D registers to be read and written, and a Clock
Input tells when to start the next conversion.
D/A conversion is relatively simple, so the D/A converter interface
generally includes only the data value.
The input value is continuously converted to analog form.
Keyboards
A keyboard is basically an array of switches, but it may include some internal logic to help
simplify the interface to the microprocessor.
A switch uses a mechanical contact to make or break an electrical circuit.
The major problem with mechanical switches is that they bounce as shown in Figure below.
When the switch is depressed by pressing on the button attached to the switch’s arm, the
force of the depression causes the contacts to bounce several times until they settle down.
If this is not corrected, it will appear that the switch has been pressed several times, giving
false inputs.
A hardware debouncing circuit can be built using a one-shot timer. Software can also be used
to debounce switch inputs. A raw keyboard can be assembled from several switches.
Each switch in a raw keyboard has its own pair of terminals, making raw keyboards impractical
when a large number of keys is required.
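Software debouncing, mentioned above, can be sketched as a filter over periodic samples of the switch. The function name and the `stable_count` threshold of three consecutive samples are assumptions; a real design would pick the sampling rate and threshold from the switch's bounce time.

```python
def debounce(samples, stable_count=3, initial=0):
    """Return the debounced switch state after each raw 0/1 sample.

    The state changes only after stable_count consecutive samples
    that differ from the current state."""
    state = initial
    run = 0                     # consecutive samples that disagree with state
    out = []
    for s in samples:
        if s == state:
            run = 0             # bounce ended or nothing changed
        else:
            run += 1
            if run >= stable_count:
                state = s       # new level has been stable long enough
                run = 0
        out.append(state)
    return out
```

Isolated bounces are ignored, so a single press produces exactly one change in the debounced output.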
Switch Bouncing
More expensive keyboards, such as those used in PCs,
actually contain a microprocessor to preprocess
button inputs.
PC keyboards typically use a 4-bit microprocessor to
provide the interface between the keys and the
computer. The microprocessor can provide
debouncing, but it also provides other functions as
well.
An encoded keyboard uses some code to represent
which switch is currently being depressed. At the
heart of the encoded keyboard is the scanned
array of switches shown in adjacent Figure.
Unlike a raw keyboard, the scanned keyboard array
reads only one row of switches at a time.
The demultiplexer at the left side of the array selects
the row to be read. When the scan input is 1, that
value is transmitted to one terminal of each key in
the row.
If the switch is depressed, the 1 is sensed at that
switch’s column. Since only one switch in the
column is activated, that value uniquely identifies a
key.
The row address and column output can be used for
encoding, or circuitry can be used to give a
different encoding.
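The row-by-row scan can be sketched as below. The function name is hypothetical, the set of `(row, col)` pairs stands in for the closed switches, and the key code `row * cols + col` is just one possible encoding built from the row address and column output.

```python
def scan_keyboard(pressed, rows, cols):
    """Scan the key array one row at a time; return the codes of pressed keys.

    pressed is a set of (row, col) pairs for switches that are closed."""
    found = []
    for row in range(rows):               # demultiplexer selects one row
        for col in range(cols):           # sense each column line in turn
            if (row, col) in pressed:     # the 1 on the row reaches this column
                found.append(row * cols + col)   # assumed key code
    return found
```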
A Scanned Key Array
Encoding the keyboard raises two problems:
1. Combinations of keys may not be represented.
e.g.: On a PC keyboard, the encoding must be chosen so that
combinations such as control-Q can be recognized and sent to the PC.
2. Rollover may not be allowed.
e.g.: if “a” is pressed and then “b” is pressed before releasing “a,” in
most applications the keyboard should send an “a” followed by a “b”.
Rollover is very common in typing at even modest rates.
A naive implementation of the encoder circuitry will simply throw away
any character depressed after the first one until all the keys are released.
The keyboard microcontroller can be programmed to provide n-key rollover,
so that rollover keys are sensed, put on a stack, and transmitted in
sequence as keys are released.
Light Emitting Diodes (LEDs)
LEDs are often used as simple displays by themselves, and arrays of
LEDs may form the basis of more complex displays.
Figure below shows how to connect an LED to a digital output.
A resistor is connected between the output pin and the LED to absorb
the voltage difference between the digital output voltage and the
forward voltage drop across the LED (0.7 V in this example).
When the digital output goes to 0, the LED voltage is in the device’s off
region and the LED is not on.
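The resistor value follows from Ohm's law applied to the voltage the resistor must absorb. In this sketch the function name is hypothetical, and the 5 V output and 10 mA LED current used in the usage note are assumed values, not taken from the notes.

```python
def led_resistor_ohms(v_out, v_led, i_led_amps):
    """Series resistance that drops (v_out - v_led) at the chosen LED current."""
    if v_out <= v_led:
        raise ValueError("output voltage must exceed the LED voltage drop")
    return (v_out - v_led) / i_led_amps
```

For an assumed 5 V output, the 0.7 V drop used above, and 10 mA through the LED, this gives about 430 ohms.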
An LED connected to a digital output
Displays
A display device may be either directly driven
or driven from a frame buffer.
The displays with a small number of elements
are driven directly by logic, while large
displays use a RAM frame buffer.
The n-digit array, shown in adjacent Figure, is
a simple example of a display that is
usually directly driven.
A single-digit display typically consists of
seven segments; each segment may be
either an LED or a Liquid Crystal Display
(LCD) element.
This display relies on the digits being visible
for some time after the drive to the digit is
removed, which is true for both LEDs and
LCDs.
The digit input is used to choose which digit is
currently being updated, and the selected
digit activates its display elements based
on the current data value.
The display’s driver is responsible for
repeatedly scanning through the digits
and presenting the current value of each
to the display.
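One pass of the display driver's scan can be sketched as follows. The table uses a common seven-segment encoding (bit 0 = segment a through bit 6 = segment g, 1 = lit); the names and the active-high convention are assumptions for illustration.

```python
# Segment patterns for decimal digits, bit 0 = segment a ... bit 6 = segment g.
SEVEN_SEG = {
    0: 0x3F, 1: 0x06, 2: 0x5B, 3: 0x4F, 4: 0x66,
    5: 0x6D, 6: 0x7D, 7: 0x07, 8: 0x7F, 9: 0x6F,
}

def scan_display(digits):
    """Return one refresh pass as (digit select, segment pattern) pairs."""
    return [(i, SEVEN_SEG[d]) for i, d in enumerate(digits)]
```

The driver would repeat this pass continuously, presenting each pair on the digit and data inputs in turn.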
An n-digit Display
A Frame Buffer is a RAM that is attached to the system bus.
The microprocessor writes values into the frame buffer in whatever
order is desired.
The pixels in the frame buffer are generally written to the display in
raster order by reading pixels sequentially.
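The difference between the write order and the read-out order can be sketched as below. The frame buffer is modeled as a flat list with the pixel at (x, y) stored at index `y * width + x`; that layout and the function name are assumptions.

```python
def raster_order(frame_buffer, width, height):
    """Yield pixels row by row, left to right, as the display reads them."""
    for y in range(height):
        for x in range(width):
            yield frame_buffer[y * width + x]
```

The microprocessor may write `frame_buffer[y * width + x]` for any (x, y) in any order; the display side always walks the buffer sequentially.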
Many large displays are built using LCD. Each pixel in the display is
formed by a single liquid crystal.
LCD displays present a very different interface to the system
because the array of pixel LCDs can be randomly accessed.
Modern LCD panels use an active matrix system that puts a
transistor at each pixel to control access to the LCD.
Early LCD panels were called passive matrix because they relied on a
two-dimensional grid of wires to address the pixels.
Active matrix displays provide higher contrast and a higher-quality
display.
Touchscreens
A Touchscreen is an input device overlaid on an output device. The Touchscreen registers the position of a
touch to its surface. By overlaying this on a display, the user can react to information shown on the
display.
The 2 most common types of touchscreens are Resistive and Capacitive.
Resistive Touchscreen: It uses a 2D voltmeter to sense position. As shown in Figure below, the touchscreen
consists of two conductive sheets separated by spacer balls.
The top conductive sheet is flexible so that it can be pressed to touch the bottom sheet. A voltage is
applied across the sheet; its resistance causes a voltage gradient to appear across the sheet.
The top sheet samples the conductive sheet’s applied voltage at the contact point.
An Analog/Digital Converter is used to measure the voltage and resulting position.
The touchscreen alternates between x and y position sensing by alternately applying horizontal and
vertical voltage gradients.
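Converting the two A/D readings into a screen position is a linear scaling, since the voltage gradient is proportional to distance across the sheet. The function name, the 10-bit converter, and the 320 × 240 display in the usage note are assumptions; a real touchscreen also needs calibration, which is omitted here.

```python
def touch_position(adc_x, adc_y, adc_bits, width, height):
    """Map raw A/D readings from the x and y measurements to pixel coordinates."""
    full_scale = (1 << adc_bits) - 1           # largest A/D code
    x = adc_x * (width - 1) // full_scale      # horizontal gradient -> x
    y = adc_y * (height - 1) // full_scale     # vertical gradient -> y
    return x, y
```

With a 10-bit converter and a 320 × 240 display, readings of 0 and 1023 map to the first and last pixel in each direction.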
Cross section of a Resistive Touchscreen
COMPONENT INTERFACING
Memory Interfacing: The memory structure is simple if a memory
of exactly the size needed can be bought.
If more memory is needed than can be bought in a single chip,
then several memory chips are needed to construct a
memory of the required size.
e.g. if 4 GB of memory is needed and a single memory chip provides
2 GB, then 2 memory chips are needed.
It may also be necessary to build a memory that is wider than
what can be bought on a single chip.
e.g. Even if a 32-bit-wide memory chip cannot be bought, a memory
of a given width (32 bits, 64 bits, etc.) can easily be constructed by
placing RAMs in parallel.
Logic may also be needed to turn the bus signals into the appropriate
memory signals.
e.g. Most buses do not send address signals in row and column form,
and appropriate refresh signals may also need to be generated.
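Placing RAMs in parallel means every chip receives the same address and each supplies one slice of the data word. In the sketch below the function name is hypothetical, each chip is modeled as a list, and putting chip 0 in the least significant bits is an assumption.

```python
def read_wide(chips, address, chip_width_bits):
    """Read one wide word from several narrow chips addressed in parallel."""
    mask = (1 << chip_width_bits) - 1
    word = 0
    for lane, chip in enumerate(chips):
        # Every chip sees the same address; chip 0 supplies the low bits.
        word |= (chip[address] & mask) << (lane * chip_width_bits)
    return word
```

Four 8-bit chips read this way behave like one 32-bit-wide memory.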
Device Interfacing: Some I/O devices are designed
to interface directly to a particular bus, forming
glue-less interfaces.
But glue logic is required when a device is
connected to a bus for which it is not designed.
An I/O device typically requires a much smaller
range of addresses than a memory, so
addresses must be decoded much more
accurately.
Some additional logic is required to cause the bus
to read and write the device’s registers.
COMPONENT INTERFACING cont’d….
The device has four registers that can be read and
written by presenting the register number on
the regid pins, asserting R/W as required, and
reading or writing the value on the regval pins.
To interface to the bus, the bottom two bits of the
address are used to refer to registers within the
device, and the remaining bits are used to
identify the device itself.
The top bits of the address are sent to a comparator
for testing against the device address.
The device’s address can be set with switches to
allow the address to be easily changed.
When the bus address matches the device’s, the
result is used to enable a transceiver for the
data pins.
When the transceiver is disabled, the regval pins
are disconnected from the data bus.
The comparator’s output is also used to modify the
R/W signal: The device’s R/W pin is given the
value (bus R/W OR not-equal address), so that
when the addresses do not match, the
device’s R/W pin always receives a 1 (read) to avoid
inadvertently writing the device registers.
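The decoding scheme described above can be modelled in a few lines. This is a behavioral sketch, not the circuit itself; the function name `glue_logic` and the example addresses are assumptions chosen for illustration, and R/W is taken as 1 = read, 0 = write, as the text implies.

```python
def glue_logic(bus_addr, bus_rw, device_addr):
    """Model the comparator, transceiver enable, and R/W gating.

    bus_rw: 1 = read, 0 = write.
    device_addr: the value set on the address switches (hypothetical).
    Returns (regid, transceiver_enabled, device_rw).
    """
    regid = bus_addr & 0b11                   # bottom two bits select one of four registers
    match = (bus_addr >> 2) == device_addr    # comparator on the remaining top bits
    not_equal = 0 if match else 1
    device_rw = bus_rw | not_equal            # forced to 1 (read) when the address does not match
    return regid, match, device_rw

# A write to register 2 of the device whose switches are set to 0x12:
print(glue_logic(0x4A, 0, 0x12))   # (2, True, 0)
# The same write aimed at another address never reaches the registers:
print(glue_logic(0x5A, 0, 0x12))   # (2, False, 1)
```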
A glue logic interface: Below
is an interfacing scheme
for a simple I/O device
COMPONENT INTERFACING cont’d….
System Architecture: An Architecture is a set of elements and the
relationships between them that together form a single unit.
The architecture of an embedded computing system is the blueprint for
implementing that system: it describes the components that are
needed and how they are put together. It includes both hardware and
software elements.
It includes several elements, some of which may be less obvious than others.
■ CPU An embedded computing system clearly contains a
microprocessor.
There are many different architectures, and even within an
architecture there are models that vary in clock speed, bus data
width, integrated peripherals, and so on.
The choice of the CPU is one of the most important, but it cannot be
made without considering the software that will execute on the machine.
■ Bus The choice of a bus is closely tied to that of a CPU, since the bus is
an integral part of the microprocessor.
But in applications that make intensive use of the bus due to I/O or
other data traffic, the bus may be more of a limiting factor than the
CPU.
DESIGNING WITH MICROPROCESSORS
System Architecture cont’d….
■ Memory The most obvious characteristic of the memory is its total size,
which depends on both the required data volume and the size of the
program instructions.
The ratio of ROM to RAM and selection of DRAM versus SRAM can have a
significant influence on the cost of the system.
The speed of the memory plays a great role in determining system
performance.
■ Input and Output devices: For a given function, there may be several
different devices of varying sophistication and cost that can do the job for
the CPU.
These devices are classed as input or output devices according to whether
they are used for input or output operations.
e.g. A set of switches and knobs on a front panel may all be controlled by a
single microcontroller, which is in turn connected to the main CPU.
The difficulty of using a particular device, such as the amount of glue logic
required to interface it, may also play a role in final device selection.
Hardware Design
Step – 1: Consider evaluation boards supplied by the microprocessor manufacturer or another company working in
collaboration with the manufacturer.
Evaluation boards are sold for many microprocessor systems; they typically include the CPU, some memory, a serial link
for downloading programs, and some minimal number of I/O devices.
Figure below shows an ARM evaluation board manufactured by Sharp. The evaluation board may be a Complete Solution
or provide what is needed with only slight modifications. If the evaluation board is supplied by the microprocessor
vendor, its design may be available from the vendor.
If the evaluation board comes from a third party, it may be possible to contract them to design a new board with the
required modifications, or start from scratch on a new board design.
Step – 2: The other major task is the choice of memory and
peripheral components.
In the case of I/O devices, there are two alternatives for each
device: selecting a component from a catalog or designing
from scratch.
When shopping for devices from a catalog, it is important to
read data sheets carefully; it may not be trivial to figure out
whether the device does what it is intended for.
Also due consideration must be given to the amount of glue
logic required to connect the device to the bus.
Simple peripheral logic can be implemented in
Programmable Logic Devices (PLDs), while more complex
units can be built from Field-programmable Gate Arrays
(FPGAs).
Hardware Design cont’d….
The PC as a Platform
Personal computers are often used as platforms
for embedded computing.
Advantages of a PC: it is a predesigned hardware
platform with a great many features, a wide
variety of I/O devices can be attached to it,
and it provides a rich programming
environment.
Disadvantage: A PC is larger, more power hungry,
and more expensive than a custom hardware
platform would be.
However, for low-volume applications and
environments such as factories and offices
where size and power are not critical, using a
PC to build an embedded system often
makes a lot of sense.
As shown in adjacent Figure, a typical PC includes
several major hardware components:
■ The CPU provides basic computational
facilities.
■ RAM is used for program storage.
■ ROM holds the boot program.
■ A DMA controller provides DMA
capabilities.
■ Timers are used by the operating system for
a variety of purposes.
■ A High-speed Bus, connected to the CPU
bus through a bridge, allows fast devices
to communicate efficiently with the rest
of the system.
■ A Low-speed Bus provides an inexpensive
way to connect simpler devices and may
be necessary for backward compatibility
as well.
Hardware architecture of a typical PC
PCI (Peripheral Component Interconnect)
PCI is a high-performance system bus that uses high-speed data
transmission techniques and efficient protocols to achieve high throughput.
The original PCI standard allowed operation up to 33 MHz; at that rate, a
maximum transfer rate of 264 MB/s can be achieved using 64-bit transfers.
The revised PCI standard allows the bus to run up to 66 MHz, giving a maximum
transfer rate of 528 MB/s with 64-bit-wide transfers.
PCI uses wide buses with many data and address bits along with multiple
control bits. The width of the PCI bus both increases the cost of an interface
to the bus and makes the physical connection to the bus more complicated.
USB (Universal Serial Bus) and IEEE 1394 are the two major high-speed serial
buses. Both of these buses offer high transfer rates using simple connectors.
They also allow devices to be chained together so that users need not worry
about the order of devices on the bus or other details of connection.
Basic Input / Output System (BIOS)
A PC provides a standard software platform that interfaces to the
underlying hardware as well as more advanced services.
At the bottom of the software platform structure in most PCs is a
minimal set of software in ROM.
This software is designed to load the complete operating system
from some other device (disk, network, etc.), and it may also
provide low-level hardware interfaces.
In the IBM-compatible PC, the low-level software is known as the
Basic Input / Output System (BIOS).
The BIOS provides low-level hardware drivers as well as booting
facilities.
The operating system provides high-level drivers, control of
executing processes, user interfaces, and so on.
System organization of the Intel StrongARM SA-1100 and SA-1111
The StrongARM SA-1100 provides a number of functions besides the ARM CPU.
The chip contains two on-chip buses: a high-speed system bus and a lower-speed peripheral bus.
The chip also uses two different clocks. A 3.686 MHz clock is used to drive the CPU and high-speed
peripherals, and a 32.768 kHz clock is an input to the system control module.
The system control module contains the following peripheral devices:
■ A real-time clock
■ An operating system timer
■ 28 general-purpose I/Os (GPIOs)
■ An interrupt controller
■ A power manager controller
■ A reset controller that handles resetting the processor.
The 32.768 kHz clock’s frequency is chosen to
be useful in timing real-time events: 32,768 = 2^15,
so a 15-bit binary counter rolls over exactly once per second.
The slower clock is also used by the power
manager to provide continued operation of
the manager at a lower clock rate and
therefore lower power consumption.
DEVELOPMENT AND DEBUGGING
Development Environments: A typical embedded computing system has a relatively small amount of
everything, including CPU horsepower, memory, I/O devices, and so forth.
As a result, it is common to do at least part of the software development on a PC or workstation, known as
a host, as illustrated in Figure below.
The hardware on which the code will finally run is known as the Target.
The host and target are frequently connected by a USB link, but a higher-speed link such as Ethernet can
also be used.
The target must include a small amount of software to talk to the host system.
That software will take up some memory, interrupt vectors, and so on, but it should generally leave the
smallest possible footprint in the target to avoid interfering with the application software.
The host should be able to do the following:
■ load programs into the target,
■ start and stop program execution on the target, and
■ examine memory and CPU registers.
Connecting a host and a target system
A Cross-compiler is a compiler that runs on one type of machine but
generates code for another.
After compilation, the executable code is downloaded to the embedded
system by a serial link or perhaps burned in a PROM and plugged in.
Host-target debuggers are often used, in which the basic hooks for debugging
are provided by the target and a more sophisticated user interface is
created by the host.
A PC or workstation offers a programming environment which is much
friendlier than the typical embedded computing platform.
A problem with this approach emerges when the code being debugged talks to I/O
devices: the host will not have the same devices configured in the same
way, so the embedded code cannot be run on the host as is.
A Test-bench program can be built to help debug the embedded code.
The Test-bench generates inputs to simulate the actions of the input devices;
it may also take the output values and compare them against expected
values, providing valuable early debugging help.
The embedded code may need to be slightly modified to work with the
Testbench, but careful coding (such as using the #ifdef directive in C) can
ensure that the changes can be undone easily and without introducing
bugs.
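A test bench of the kind described above can be sketched as follows. The routine `control_step` is a hypothetical stand-in for the embedded code, and passing the I/O functions as parameters plays the role that `#ifdef` plays in C: the same logic runs against simulated devices on the host and real devices on the target.

```python
def control_step(read_sensor, write_actuator):
    """Hypothetical embedded routine: doubles the sensor reading and clamps it to 8 bits."""
    value = read_sensor()
    write_actuator(min(2 * value, 255))

def run_testbench(stimuli, expected):
    """Feed simulated inputs to control_step and compare outputs with expected values."""
    inputs = iter(stimuli)
    outputs = []
    for _ in stimuli:
        control_step(lambda: next(inputs), outputs.append)
    mismatches = [(i, got, want)
                  for i, (got, want) in enumerate(zip(outputs, expected))
                  if got != want]
    return outputs, mismatches

outputs, mismatches = run_testbench([1, 100, 200], [2, 200, 255])
print(outputs, mismatches)   # [2, 200, 255] []
```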
DEVELOPMENT AND DEBUGGING cont’d….
Debugging Techniques (S/W based)
Software debugging can be done by compiling and executing
the code on a PC or workstation.
But at some point it inevitably becomes necessary to run code on
the embedded hardware platform.
Embedded systems are usually less friendly programming
environments than PCs, but the resourceful designer has
several options available for debugging the system.
The serial port found on most evaluation boards is one of the
most important debugging tools.
It is a good idea to design a serial port into an embedded system
even if it is not likely to be used in the final product; the serial
port can be used not only for development debugging but also
for diagnosing problems in the field.
Another very important debugging tool is the
Breakpoint.
The simplest form of a Breakpoint is for the user to
specify an address at which the program’s execution
is to break.
When the PC reaches that address, control is returned
to the monitor program.
From the monitor program, the user can examine
and/or modify CPU registers, after which execution
can be continued.
Implementing breakpoints does not require using
exceptions or external devices.
Debugging Techniques (S/W based) cont’d….
The following programming example shows how to use instructions to create
breakpoints.
Breakpoints: A breakpoint is a location in memory at which a program stops
executing and returns to the debugging tool or monitor program.
Implementing breakpoints is very simple: it only requires replacing the
instruction at the breakpoint location with a subroutine call to the monitor.
To establish a breakpoint at location 0x40c in some ARM code, the branch (B)
instruction normally held at that location is replaced with a
subroutine call (BL) to the breakpoint handling routine.
When the breakpoint handler is called, it saves all the registers and can then display
the CPU state to the user and take commands.
To continue execution, the original instruction must be restored in the program.
If the breakpoint can be erased, the original instruction can simply be put back and
control returned to that instruction.
This will normally require fixing the subroutine return address, which will point to the
instruction after the breakpoint.
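The set-and-restore mechanics can be illustrated with a toy model in which instruction memory is a dictionary of address to instruction text. The class name `BreakpointMonitor`, the handler label `bkpoint`, and the program fragment are assumptions made for this sketch; a real monitor patches machine code in target memory.

```python
class BreakpointMonitor:
    """Toy monitor that sets and clears breakpoints in a dict-based instruction memory."""

    def __init__(self, memory):
        self.memory = memory      # address -> instruction text
        self.saved = {}           # address -> original instruction

    def set_breakpoint(self, addr):
        if addr not in self.saved:                 # do not save the BL over the original
            self.saved[addr] = self.memory[addr]
            self.memory[addr] = "BL bkpoint"       # call into the breakpoint handler

    def clear_breakpoint(self, addr):
        self.memory[addr] = self.saved.pop(addr)   # put the original instruction back
        return addr   # resume here, not at the BL's return address (addr + 4)

program = {0x408: "ADD r0,r0,#1", 0x40c: "B loop"}   # hypothetical ARM fragment
monitor = BreakpointMonitor(program)
monitor.set_breakpoint(0x40c)
print(program[0x40c])                      # BL bkpoint
resume_at = monitor.clear_breakpoint(0x40c)
print(hex(resume_at), program[0x40c])      # 0x40c B loop
```

Returning the breakpoint address from `clear_breakpoint` reflects the return-address fix-up: execution must resume at the restored instruction rather than at the instruction after it.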
Debugging Techniques (S/W based) cont’d….
When Software Tools are insufficient to debug the system, Hardware
aids can be deployed to give a clearer view of what is happening
when the system is running.
The microprocessor In Circuit Emulator (ICE) is a specialized hardware
tool that can help debug software in a working embedded system.
An ICE is a special version of the microprocessor that allows its internal
registers to be read out when it is stopped.
The In-circuit Emulator surrounds this specialized microprocessor with
additional logic that allows the user to specify breakpoints and
examine and modify the CPU state.
The CPU provides as much debugging functionality as a debugger
within a monitor program, but does not take up any memory.
Drawback of In-circuit Emulation: The machine is specific to a
particular microprocessor, even down to the pinout.
If several microprocessors are used, maintaining a fleet of In-circuit
Emulators to match can be very expensive.
Debugging Techniques (H/W based) cont’d….
The Logic Analyzer is the other major piece of instrumentation in the
embedded system designer’s arsenal.
Think of a logic analyzer as an array of inexpensive oscilloscopes; the
analyzer can sample many different signals simultaneously (tens to
hundreds) but can display only 0, 1, or changing values for each.
All these logic analysis channels can be connected to the system to
record the activity on many signals simultaneously.
The logic analyzer records the values on the signals into an internal
memory and then displays the results on a display once the memory
is full or the run is aborted.
The logic analyzer can capture thousands or even millions of samples
of data on all of these channels, providing a much larger time
window into the operation of the machine than is possible with a
conventional oscilloscope.
Debugging Techniques (H/W based) cont’d….
A typical Logic Analyzer can acquire data in either of two modes that are
typically called State and Timing modes.
The measurement resolution on each signal is reduced in both voltage and
time dimensions.
The reduced voltage resolution is accomplished by measuring logic values (0,
1, x) rather than analog voltages.
The reduction in Timing resolution is accomplished by sampling the signal,
rather than capturing a continuous waveform as in an analog oscilloscope.
State and timing modes represent different ways of sampling the values.
Timing mode uses an internal clock that is fast enough to take several
samples per clock period in a typical system.
State mode uses the system’s own clock to control sampling, so it samples
each signal only once per clock cycle.
As a result, timing mode requires more memory to store a given number of
system clock cycles.
On the other hand, it provides greater resolution in the signal for detecting
glitches.
Timing mode is typically used for glitch-oriented debugging, while state mode
is used for sequentially oriented problems.
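The difference between the two modes can be seen in a small simulation. The oversampling factor of 4 and the glitch position are assumptions chosen for illustration; time is counted in ticks of the analyzer's internal clock.

```python
SAMPLES_PER_CLOCK = 4   # assumed oversampling factor for timing mode

def signal(tick):
    """Hypothetical signal: low except for a short glitch in the middle of clock cycle 2."""
    return 1 if tick == 10 else 0

def capture(num_cycles, mode):
    """Sample the signal once per system clock (state) or once per tick (timing)."""
    step = SAMPLES_PER_CLOCK if mode == "state" else 1
    return [signal(t) for t in range(0, num_cycles * SAMPLES_PER_CLOCK, step)]

state = capture(4, "state")     # one sample per system clock
timing = capture(4, "timing")   # several samples per system clock
print(len(state), sum(state))     # 4 0   -> glitch missed
print(len(timing), sum(timing))   # 16 1  -> glitch captured
```

Timing mode stores four times as many samples for the same four clock cycles, which is the memory cost the text refers to, but it is the only mode that sees the glitch.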
Debugging Techniques (H/W based) cont’d….
The Internal Architecture of a logic analyzer is shown in Figure below.
The system’s data signals are sampled at a latch within the logic analyzer; the latch is controlled by either the
system clock or the internal logic analyzer sampling clock, depending on whether the analyzer is being used
in state or timing mode.
Each sample is copied into a vector memory under the control of a state machine.
The latch, timing circuitry, sample memory, and controller must be designed to run at high speed since several
samples per system clock cycle may be required in timing mode.
After the sampling is complete, an embedded microprocessor takes over to control the display of the data
captured in the sample memory.
Logic analyzers typically provide a number of formats for viewing data. One format is a timing diagram format.
Many logic analyzers allow not only customized displays, such as giving names to signals, but also more advanced
display options.
For example, an inverse assembler can be used to turn vector values into microprocessor instructions.
The logic analyzer does not provide access to
the internal state of the components, but it
does give a very good view of the externally
visible signals.
That information can be used for both
Functional and timing debugging.
Architecture of a Logic Analyzer
Debugging Techniques (H/W based) cont’d….
Debugging Challenges
Logical errors in software can be hard to track down, but errors in real-time code can
create problems that are even harder to diagnose.
Real-time programs are required to finish their work within a certain amount of time;
if they run too long, they can create very unexpected behavior.
Example below demonstrates one of the problems that can arise.
A timing error in real-time code: To make it easier to compare input to output and
see the results of the bug, assume that the computation produces an output
equal to the input, but that a bug causes the computation to run 50% longer than
its given time interval.
A sample input to the program over several sample periods follows:
If the program ran fast enough to meet its deadline, the output would simply be a time
shifted copy of the input.
But when the program runs over its allotted time, the output will become very different.
Working out exactly what happens requires some assumptions about the behavior of the
A/D and D/A converters.
First, the A/D converter holds its current sample in a register until the next sample period,
and the D/A converter changes its output whenever it receives a new sample.
Next, a reasonable assumption about interrupt systems is that, when an interrupt is not
satisfied and the device interrupts again, the device’s old value will disappear and be
replaced by the new value.
The basic situation that develops when the interrupt routine runs too long is something
like this:
1. The A/D converter is prompted by the timer to generate a new value, saves it in
the register, and requests an interrupt.
2. The interrupt handler is still running, having taken too long on the previous sample.
3. The A/D converter gets another sample at the next period.
4. The interrupt handler finishes its first request and then immediately responds to
the second interrupt. It never sees the first sample and only gets the second one.
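The sequence above can be simulated under a few stated assumptions: time is counted in ticks with one sample every 2 ticks and a handler that needs 3 ticks (1.5 sample periods); the handler reads the A/D register when it starts and updates the output when it finishes; and within a tick the A/D update is processed before the handler. The function name `simulate` is hypothetical.

```python
def simulate(samples, sample_period=2, handler_time=3):
    """Return (completion_tick, value) pairs for an interrupt handler that overruns."""
    register = None      # A/D holding register; a new sample overwrites the old one
    pending = False      # interrupt request from the A/D converter
    busy_until = None    # tick at which the running handler finishes
    current = None       # value the running handler read when it started
    outputs = []
    total_ticks = sample_period * len(samples) + handler_time
    for tick in range(total_ticks + 1):
        index = tick // sample_period
        if tick % sample_period == 0 and index < len(samples):
            register = samples[index]             # new sample replaces the old value
            pending = True
        if busy_until == tick:
            outputs.append((tick, current))       # D/A output changes when the handler finishes
            busy_until = None
        if busy_until is None and pending:
            current, pending = register, False    # handler starts on the latest sample only
            busy_until = tick + handler_time
    return outputs

print(simulate([0, 1, 2, 3, 4, 5, 6]))
# [(3, 0), (6, 1), (9, 3), (12, 4), (15, 6)]
```

Under these assumptions samples 2 and 5 are never seen, and the surviving samples appear progressively later than they should.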
Debugging Challenges cont’d….
Thus, assuming that the Interrupt Handler takes 1.5 times longer than it should, here is
how it would process the sample input:
Figure: input samples and output samples plotted over the sample periods
The output waveform is seriously distorted because the interrupt routine grabs the wrong
samples and puts the results out at the wrong times.
The exact results of missing real-time deadlines depend on the detailed characteristics of
the I/O devices and the nature of the timing violation.
This makes debugging real-time problems especially difficult. If a system exhibits truly
unusual behavior, missed deadlines should be suspected.
In-circuit emulators, logic analyzers, and even LEDs can be useful tools in checking the
execution time of real-time code to determine whether it in fact meets its deadline.
Debugging Challenges cont’d….
SYSTEM-LEVEL PERFORMANCE ANALYSIS
SYSTEM-LEVEL PERFORMANCE involves much more than the CPU.
The focus is often on the CPU because it processes instructions, but any part of the system
can affect total system performance.
More precisely, the CPU provides an upper bound on performance, but any other part of the
system can slow down the CPU. Merely counting instruction execution times is not
enough.
Consider the simple system of Figure below. Data needs to be moved from memory to the CPU
to process it.
To get the data from memory to the CPU, the following must be done:
■ read from the memory;
■ transfer over the bus to the cache; and
■ transfer from the cache to the CPU.
System level Data Flows and Performance
The time required to transfer from the cache to the CPU is included in the
instruction execution time, but the other two times are not.
The most basic measure of performance is bandwidth: the rate at which
data can be moved.
The point of interest is real-time performance measured in seconds.
But often the simplest way to measure performance is in units of clock cycles.
However, different parts of the system will run at different clock rates.
So, it has to be ensured that the right clock rate is applied to each part of the
performance estimate while converting clock cycles to seconds.
For simplicity, consider the bandwidth provided by only one system
component, the bus.
Consider an image of 320 × 240 pixels, with each pixel composed of 3 bytes of
data. This gives a grand total of 230,400 bytes of data.
If these images are video frames, it must be checked whether one frame can be
pushed through the system within the 1/30 s available to process it
before the next one arrives.
SYSTEM-LEVEL PERFORMANCE ANALYSIS cont’d….
Let the bus clock period be P and the bus width be W, with W in units of
bytes (other measures of width could be used as well).
The goal is to write formulas for the time required to transfer N bytes of data.
The basic formulas are written in units of bus cycles T and then converted to
real time t using the bus clock period P:
$$t = T P \tag{4.1}$$
As shown in Figure below, a basic bus transfer transfers a W-wide set of bytes.
The data transfer itself takes D clock cycles. (Ideally, D = 1, but a memory that
introduces wait states is one example of a transfer that could require D > 1
cycles.)
SYSTEM-LEVEL PERFORMANCE ANALYSIS cont’d….
Times and data volumes in a basic bus transfer
Addresses, handshaking, and other activities constitute overhead that may occur
before (O1) or after (O2) the data.
For simplicity, let the overhead be summed into O = O1 + O2.
This gives a total transfer time in clock cycles of:
$$T_{basic}(N) = (D + O) \, \frac{N}{W} \tag{4.2}$$
As shown in Figure below, a burst transaction performs B transfers of W bytes each.
Each of those transfers will require D clock cycles. The bus also introduces O cycles of
overhead per burst. This gives
$$T_{burst}(N) = (B D + O) \, \frac{N}{B W} \tag{4.3}$$
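Equations 4.2 and 4.3 can be compared numerically to see how a burst spreads the overhead over several transfers. The function names and the bus parameters in the usage lines (W = 2, D = 1, O = 3, B = 4) are assumptions for this sketch; the frame size is the 230,400 bytes computed earlier.

```python
def t_basic(n_bytes, width, d=1, o=0):
    """Bus cycles for N bytes using basic transfers: (D + O) * N / W."""
    return (d + o) * n_bytes / width

def t_burst(n_bytes, width, burst, d=1, o=0):
    """Bus cycles for N bytes using bursts of B transfers: (B*D + O) * N / (B*W)."""
    return (burst * d + o) * n_bytes / (burst * width)

frame = 320 * 240 * 3   # 230,400 bytes in one frame
print(t_basic(frame, width=2, d=1, o=3))            # 460800.0
print(t_burst(frame, width=2, burst=4, d=1, o=3))   # 201600.0
```

With the same per-transaction overhead, the burst pays O once per four transfers instead of once per transfer, which is why its cycle count is less than half.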
Transferring data into and out of components also raises questions of bandwidth. The
simplest illustration of this problem is memory.
SYSTEM-LEVEL PERFORMANCE ANALYSIS cont’d….
Times and data volumes in a burst bus transfer
A single memory chip is not solely specified by the number of bits it can hold.
As shown in Figure below, memories of the same size can have different Aspect Ratios.
e.g. A 64-Mbit memory that is 1 bit wide will present 64 million addresses of 1-bit data. The same size
memory in a 4-bit-wide format will have 16 million distinct addresses, and an 8-bit-wide memory will have 8
million distinct addresses.
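The relationship between capacity, width, and number of addresses is a single division. The sketch below assumes a 64-Mbit capacity and takes "M" as 10^6 to give round numbers; the function name `num_addresses` is illustrative.

```python
def num_addresses(capacity_bits, width_bits):
    """Number of distinct addresses for a memory of the given capacity and word width."""
    return capacity_bits // width_bits

CAPACITY = 64_000_000   # 64 Mbit, with M taken as 10**6 (assumption)
for width in (1, 4, 8):
    print(width, num_addresses(CAPACITY, width))
# 1 64000000
# 4 16000000
# 8 8000000
```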
Memory chips do not come in extremely wide aspect ratios but wider memories can be built by using
several chips.
The memory system width may also be determined by the memory modules used. Rather than buying memory
chips individually, memory may be bought as SIMMs or DIMMs.
Which aspect ratio is preferable for the overall memory system also depends on the format of the data
that needs to be stored in the memory and the speed with which it must be accessed, giving rise to
bandwidth analysis.
SYSTEM-LEVEL PERFORMANCE ANALYSIS cont’d….
Bandwidth analysis becomes more involved if the data types do not fit naturally into the width of the memory.
Suppose color video pixels need to be stored in the memory.
A standard pixel is three 8-bit color values (say red, green, blue).
A 24-bit-wide memory would allow an entire pixel value to be read or written in
one access.
An 8-bit-wide memory, in contrast, would require three accesses for the pixel.
With a 32-bit-wide memory, there are 2 main choices:
1. One byte of each transfer could be wasted or used to store unrelated data, or
2. The pixels can be packed.
In the 2nd case, the first read would get all of the first pixel and one byte of
the second pixel; the second transfer would get the last two bytes of the
second pixel and the first two bytes of the third pixel; and so forth.
The total number of accesses A required to read E data elements of w bits
each, packed into a memory of width W bits, is:
$$A = \left\lceil \frac{E \, w}{W} \right\rceil \tag{4.4}$$
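The two storage choices can be compared by counting memory words directly. This is a sketch with hypothetical function names; when elements are stored back to back, the number of words touched is the total number of bits divided by the memory width, rounded up.

```python
def accesses_packed(num_elements, element_bits, memory_width_bits):
    """Memory words touched when elements are stored back to back with no padding."""
    total_bits = num_elements * element_bits
    return (total_bits + memory_width_bits - 1) // memory_width_bits   # integer ceiling

def accesses_unpacked(num_elements, element_bits, memory_width_bits):
    """Memory words touched when every element starts on a fresh word."""
    words_per_element = (element_bits + memory_width_bits - 1) // memory_width_bits
    return num_elements * words_per_element

# Four 24-bit pixels in a 32-bit-wide memory:
print(accesses_packed(4, 24, 32), accesses_unpacked(4, 24, 32))   # 3 4
# The same pixels in an 8-bit-wide memory need three accesses each either way:
print(accesses_packed(4, 24, 8), accesses_unpacked(4, 24, 8))     # 12 12
```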
SYSTEM-LEVEL PERFORMANCE ANALYSIS cont’d….
Performance bottlenecks in a bus-based system
Consider a simple bus-based system: data has to be transferred
between the CPU and the memory over the bus.
We need to be able to read a 320 × 240 video frame into the CPU at
the rate of 30 frames/s, for a total of 612,000 bytes/s.
Which will be the bottleneck and limit system performance: the bus or
the memory?
Let’s assume that the bus has a 1-MHz clock rate (period of $10^{-6}$ s)
and is 2 bytes wide, with D = 1 and O = 3.
This gives a total transfer time of
$$T_{basic} = (1 + 3) \cdot \frac{612{,}000}{2} = 1{,}224{,}000 \text{ cycles} \tag{4.5}$$
$$t = T_{basic} \cdot P = 1{,}224{,}000 \times 1 \times 10^{-6} = 1.224 \text{ s} \tag{4.6}$$
Since the total time to transfer one second’s worth of frames is more
than 1 s, the bus is not fast enough for our application.
The memory provides a burst mode with B = 4 but is only 4 bits wide,
giving W = 0.5.
For this memory, D = 1 and O = 4. The clock period for this memory is
$10^{-7}$ s. Then
$$T_{mem} = (4 \cdot 1 + 4) \cdot \frac{612{,}000}{4 \times 0.5} = 2{,}448{,}000 \text{ cycles} \tag{4.7}$$
$$t = T_{mem} \cdot P = 2{,}448{,}000 \times 1 \times 10^{-7} = 0.2448 \text{ s} \tag{4.8}$$
The memory requires less than 1 s to transfer the 30 frames that must be
transmitted in 1 s, so it is fast enough.
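The bottleneck check above can be reproduced with one helper that covers both the basic case (B = 1) and the burst case. The function name `transfer_seconds` is hypothetical; all numeric parameters, including the 612,000 bytes/s data volume, are taken as stated in the text.

```python
def transfer_seconds(n_bytes, width, period, d=1, o=0, burst=1):
    """Return (cycles, seconds) for moving n_bytes; burst=1 gives the basic-transfer formula."""
    cycles = (burst * d + o) * n_bytes / (burst * width)
    return cycles, cycles * period

N = 612_000   # bytes per second, the figure used in the text
bus_cycles, bus_time = transfer_seconds(N, width=2, period=1e-6, d=1, o=3)
mem_cycles, mem_time = transfer_seconds(N, width=0.5, period=1e-7, d=1, o=4, burst=4)

print(f"bus:    {bus_cycles:.0f} cycles, {bus_time:.4f} s")   # bus:    1224000 cycles, 1.2240 s
print(f"memory: {mem_cycles:.0f} cycles, {mem_time:.4f} s")   # memory: 2448000 cycles, 0.2448 s
print("bottleneck:", "bus" if bus_time > mem_time else "memory")   # bottleneck: bus
```

The memory needs twice as many cycles as the bus, but its clock is ten times faster, so the bus is the component that limits throughput.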
Performance bottlenecks in a bus-based system
Parallelism
When different components of the
system operate in parallel, more
work can be done in a given
amount of time.
Direct Memory Access is a prime
example of parallelism. DMA was
designed to off-load memory
transfers from the CPU.
The CPU can do other useful work
while the DMA transfer is running.
Figure below shows the paths of
data transfers without and with
DMA when transferring from
memory to a device.
Without DMA, the data must go
through the CPU; the CPU cannot
do useful work at that time.
DMA transfers and parallelism
The CPU is tied up for the amount of time required for the
bus transfer.
Since buses often operate at slower clock rates than the
CPU, that time can be considerable.
The system performance can be increased significantly by
overlapping operations on the different units of the
system.
The adjacent Figure shows timing
diagrams for two versions of a computation.
The top timing diagram shows activity in the system when
the CPU first performs some setup operations, then
waits for the bus transfer to complete, then resumes
its work.
In the bottom timing diagram, the program on the CPU has
been rewritten so that its main work is broken into
two sections.
In this case, once the first transfer is done, the CPU can
start working on that data.
Meanwhile, due to DMA, the second transfer happens on
the bus at the same time.
Once that data arrives and the first calculation is finished,
the CPU can go on to the second part of the
computation.
The result is that the entire computation finishes
considerably earlier than in the sequential case.
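The gain from overlapping can be estimated with a simple schedule model. The sketch assumes that DMA moves the blocks back to back (so a buffer exists for the second block while the first is being processed) and that the durations in the usage lines are arbitrary illustrative values; the function names are hypothetical.

```python
def sequential_time(setup, transfers, computations):
    """CPU waits for every transfer before computing on it."""
    return setup + sum(transfers) + sum(computations)

def overlapped_time(setup, transfers, computations):
    """DMA fetches the next block while the CPU computes on the current one."""
    bus_free = cpu_free = setup
    for transfer, compute in zip(transfers, computations):
        bus_free += transfer                            # the bus moves blocks back to back
        cpu_free = max(cpu_free, bus_free) + compute    # CPU needs the data and must be idle
    return cpu_free

print(sequential_time(2, [5, 5], [4, 4]))   # 20
print(overlapped_time(2, [5, 5], [4, 4]))   # 16
```

In this example the first computation is completely hidden behind the second transfer, so the overlapped schedule saves exactly the time of that computation.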
Sequential and parallel schedules in a bus-based system
Parallelism cont’d….
Design Example : ALARM CLOCK
Requirements: the adjacent Figure
illustrates the front panel design for
the alarm clock.
The time is shown as four digits in 12-h
format; a light has been used to
distinguish between AM and PM.
Several buttons are used to set the clock
time and alarm time.
When the hour and minute buttons are
pressed, the hour and minute are
advanced, respectively, by one.
When setting the time, the set time
button must be held down while the
hour and minute buttons are hit; the
set alarm button works in a similar
fashion.
With the alarm on and alarm off buttons,
the alarm is turned on and off.
When the alarm is activated, the alarm
ready light is on. A separate speaker
provides the audible alarm.
Front panel of the alarm clock
The Requirements Table:
Design Example : ALARM CLOCK cont’d….
Figure 1 shows the basic classes for the
alarm clock.
The class that handles the basic clock operation is
called the Mechanism class (a term borrowed from
mechanical watches).
Three classes represent physical elements:
Lights* for all the digits and lights,
Buttons* for all the buttons, and
Speaker* for the sound output.
The Buttons* class can easily be used directly by
Mechanism.
The physical display must be scanned to generate the
digits output, so the Display class is introduced to
abstract the physical lights.
The details of the low-level user interface classes are
shown in Figure 2.
The Buzzer* class allows the buzzer to be turned off;
analog electronics will be used to generate the buzz
tone for the speaker.
The Buttons* class provides read-only access to the
current state of the buttons.
The Lights* class allows the lights to be driven.
To save pins on the display, Lights* provides signals
for only one digit at a time, along with a set of signals to
indicate which digit is currently being addressed.
Figure 1: Class diagram for the alarm clock
Figure 2: Details of the low-level classes for the alarm clock
Specification
The display is generated by scanning the digits periodically. This function is performed by the Display
class, which makes the display appear as an unscanned, continuous display to the rest of the
system.
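The scanning performed by the Display class can be sketched as follows. This is a minimal illustration in C, not the notes' implementation: the structs stand in for the Lights* and Display classes, and the field names (digit_val, digit_scan, current) are assumptions made for the example. The rest of the system writes all four digits at once, while the periodic scan routine drives one digit at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_DIGITS 4

/* Hypothetical stand-in for the Lights* hardware interface:
   one digit value plus a select that says which digit is addressed. */
struct lights {
    uint8_t digit_val;   /* value currently driven on the digit lines */
    uint8_t digit_scan;  /* which of the four digits is addressed     */
};

/* Display keeps the whole time, so it looks unscanned to its clients. */
struct display {
    uint8_t digits[NUM_DIGITS]; /* time as seen by the rest of the system */
    uint8_t current;            /* next digit position to refresh         */
};

/* Called by Mechanism whenever the displayed time changes. */
void display_set_time(struct display *d, const uint8_t digits[NUM_DIGITS])
{
    assert(d != NULL && digits != NULL);
    for (int i = 0; i < NUM_DIGITS; i++)
        d->digits[i] = digits[i];
}

/* Called periodically: drive one digit, then move on to the next position. */
void display_scan(struct display *d, struct lights *hw)
{
    assert(d != NULL && hw != NULL);
    hw->digit_scan = d->current;
    hw->digit_val  = d->digits[d->current];
    d->current = (uint8_t)((d->current + 1) % NUM_DIGITS);
}
```

Calling display_scan() often enough makes the four digits appear continuously lit, which is the effect the Display class is meant to hide from Mechanism.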
The Mechanism class is described in the figure below.
This class keeps track of the current time, the current alarm time, whether the alarm has been
turned on, and whether it is currently buzzing.
The clock shows the time only to the minute, but it keeps internal time to the second.
The time is kept as discrete digits rather than a single integer to simplify transferring the time to the
display.
The class provides two behaviors, both of which run continuously.
I. Scan-keyboard is responsible for looking at the inputs and updating the alarm and other
functions as requested by the user.
II. Update-time keeps the current time
accurate.
The Mechanism Class
The adjacent figure shows the state
diagram for update-time.
This behavior is straightforward,
but it must do several things.
It is activated once per second and
must update the seconds clock.
If it has counted 60 s, it must then
update the displayed time;
when it does so, it must roll
over between digits and keep
track of AM-to-PM and PM-to-
AM transitions.
It sends the updated time to the
display object.
It also compares the time with the
alarm setting and sets the
alarm buzzing under proper
conditions.
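The update-time behavior described above can be sketched in C as follows. This is only an illustration under stated assumptions: the struct layout and all names are hypothetical, the time is held as discrete digits as in the Mechanism class, and sending the new time to the display is left to the caller, which is signalled through the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Time held as discrete digits; field names are illustrative. */
struct clock_time {
    uint8_t hour_tens, hour_ones;  /* 01..12 */
    uint8_t min_tens, min_ones;    /* 00..59 */
    bool pm;
};

struct mechanism {
    uint8_t seconds;               /* internal seconds count, 0..59 */
    struct clock_time now;
    struct clock_time alarm;
    bool alarm_on;
    bool alarm_buzzing;
};

static bool same_time(const struct clock_time *a, const struct clock_time *b)
{
    return a->hour_tens == b->hour_tens && a->hour_ones == b->hour_ones &&
           a->min_tens == b->min_tens && a->min_ones == b->min_ones &&
           a->pm == b->pm;
}

/* Advance the displayed time by one minute, rolling over between digits. */
static void advance_minute(struct clock_time *t)
{
    if (++t->min_ones < 10) return;
    t->min_ones = 0;
    if (++t->min_tens < 6) return;
    t->min_tens = 0;

    uint8_t hour = (uint8_t)(t->hour_tens * 10 + t->hour_ones);
    hour = (hour == 12) ? 1 : (uint8_t)(hour + 1);
    if (hour == 12)
        t->pm = !t->pm;            /* 11:59 to 12:00 flips AM/PM */
    t->hour_tens = (uint8_t)(hour / 10);
    t->hour_ones = (uint8_t)(hour % 10);
}

/* Called once per second. Returns true when the displayed time changed,
   in which case the caller would send the new digits to the display. */
bool update_time(struct mechanism *m)
{
    assert(m != NULL);
    if (++m->seconds < 60)
        return false;
    m->seconds = 0;
    advance_minute(&m->now);
    if (m->alarm_on && same_time(&m->now, &m->alarm))
        m->alarm_buzzing = true;
    return true;
}
```

Keeping the digits separate makes the rollover logic slightly longer than with a single integer, but the result can be handed to the display without any division.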
Specification cont’d….
State diagram for
update-time
The state diagram for scan-keyboard is shown in
the adjacent figure.
This function is called periodically, frequently enough
so that all the user’s button presses are caught by
the system.
Because the keyboard will be scanned several times
per second, the same button press must not
be registered several times.
e.g.: if the minutes count were advanced on every
keyboard scan while the set-time and minutes
buttons were pressed, the time would be
advanced much too fast.
To make the buttons respond more reasonably, the
function computes button activations; it
compares the current state of the button to the
button’s value on the last scan, and it considers
the button activated only when it is on for this
scan but was off for the last scan.
After computing the activation values for all the
buttons, it looks at the activation combinations
and takes the appropriate actions.
Before exiting, it saves the current button values for
computing activations the next time this behavior
is executed.
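The activation computation can be sketched in C as below. The bit assignments for the buttons and all names are hypothetical, chosen only for this example. The sketch treats the set-time button as a held level and the minute button as an activation, which is one plausible reading of the front-panel description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for the six buttons. */
enum {
    BTN_SET_TIME  = 1u << 0,
    BTN_SET_ALARM = 1u << 1,
    BTN_HOUR      = 1u << 2,
    BTN_MINUTE    = 1u << 3,
    BTN_ALARM_ON  = 1u << 4,
    BTN_ALARM_OFF = 1u << 5
};

struct scanner {
    uint8_t previous;  /* button values saved from the last scan */
};

/* Returns a mask of the buttons that are on for this scan
   but were off for the last scan. */
uint8_t scan_activations(struct scanner *s, uint8_t current)
{
    assert(s != NULL);
    uint8_t activated = (uint8_t)(current & ~s->previous);
    s->previous = current;  /* saved for the next execution */
    return activated;
}

/* Set-time is held down (level); minute is hit (activation). */
int should_advance_minute(uint8_t current, uint8_t activated)
{
    return (current & BTN_SET_TIME) && (activated & BTN_MINUTE);
}
```

If both buttons stay pressed over several scans, should_advance_minute() is true only on the first scan, so the minutes count advances once per press.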
State diagram for scan-keyboard
Specification cont’d….
The system has both Periodic and Aperiodic components; the current time must
obviously be updated periodically, and the button commands occur
occasionally.
The architecture can be built from the following two major software components:
■ An Interrupt-driven Routine can update the current time.
The current time will be kept in a variable in memory.
A timer can be used to interrupt periodically and update the time.
The display must be sent the new value when the minute value changes.
This routine can also maintain the PM indicator.
■ A Foreground Program can poll the buttons and execute their commands.
Since the buttons change state at a relatively slow rate, it makes no sense to
add the hardware required to connect the buttons to interrupts.
Instead, the foreground program reads the button values and then uses
simple conditional tests to implement the commands, including setting
the current time, setting the alarm, and turning off the alarm.
Another routine called by the foreground program will turn the buzzer on
and off based on the alarm time.
System Architecture
The Foreground Code will be implemented as a while loop:
while (TRUE) {
    read_buttons(button_values);     /* read inputs */
    process_command(button_values);  /* do commands */
    check_alarm();                   /* decide whether to turn on the alarm */
}
The loop first reads the buttons using read_buttons().
In addition to reading the current button values from the input device, this routine must preprocess the
button values so that the user interface code will respond properly.
As shown in the figure below, this can be done by performing simple edge detection on the button input: the
button event value is 1 for one sample period when the button is depressed, then goes back to 0,
and does not return to 1 until the button has been released and is depressed again.
This can be accomplished by a simple
two-state machine.
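A two-state machine of this kind might look as follows for a single button. This is a sketch only: the state names and the function button_event() are hypothetical, and read_buttons() would apply the same machine to each button in turn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The two states: waiting for a press, or waiting for a release. */
enum button_state { BUTTON_UP, BUTTON_DOWN };

struct button_fsm {
    enum button_state state;
};

/* Feed one sample of the raw input; returns the button event value (1 or 0). */
int button_event(struct button_fsm *b, bool pressed)
{
    assert(b != NULL);
    switch (b->state) {
    case BUTTON_UP:
        if (pressed) {
            b->state = BUTTON_DOWN;
            return 1;              /* event lasts one sample period */
        }
        return 0;
    case BUTTON_DOWN:
        if (!pressed)
            b->state = BUTTON_UP;  /* re-arm once the button is released */
        return 0;
    }
    return 0;
}
```

For the input samples 1, 1, 1, 0, 1 the event values are 1, 0, 0, 0, 1.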
System Architecture cont’d….
Preprocessing button inputs
The process_command() function is responsible for responding to
button events.
The check_alarm() function checks the current time against the alarm
time and decides when to turn on the buzzer.
This routine is kept separate from the Command
Processing Code since the alarm must go on when the proper time
is reached, independent of the button inputs.
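The division of work between the two routines can be sketched as below. Everything here is an assumption made for illustration: the signatures differ from the calls in the foreground loop because the sketch passes the clock state explicitly, it separates raw button levels (held) from one-sample events (events), and for brevity it stores the hour and minute as small integers rather than discrete digits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments, shared by levels and events. */
enum { EV_SET_TIME = 1, EV_SET_ALARM = 2, EV_HOUR = 4, EV_MINUTE = 8,
       EV_ALARM_ON = 16, EV_ALARM_OFF = 32 };

struct hm { uint8_t hour; uint8_t minute; bool pm; };  /* hour is 1..12 */

struct clock_state {
    struct hm now, alarm;
    bool alarm_on, buzzing;
};

static void bump_hour(struct hm *t)
{
    t->hour = (uint8_t)(t->hour % 12 + 1);  /* 12 wraps to 1 */
    if (t->hour == 12)
        t->pm = !t->pm;                     /* 11 to 12 flips AM/PM */
}

static void bump_minute(struct hm *t)
{
    t->minute = (uint8_t)((t->minute + 1) % 60);  /* no carry into the hour */
}

/* Respond to button events using simple conditional tests. */
void process_command(struct clock_state *c, unsigned held, unsigned events)
{
    assert(c != NULL);
    struct hm *target = NULL;
    if (held & EV_SET_TIME)       target = &c->now;
    else if (held & EV_SET_ALARM) target = &c->alarm;

    if (target != NULL) {
        if (events & EV_HOUR)   bump_hour(target);
        if (events & EV_MINUTE) bump_minute(target);
    }
    if (events & EV_ALARM_ON)  c->alarm_on = true;
    if (events & EV_ALARM_OFF) { c->alarm_on = false; c->buzzing = false; }
}

/* Independent of the buttons: start the buzzer at the alarm time. */
void check_alarm(struct clock_state *c)
{
    assert(c != NULL);
    if (c->alarm_on &&
        c->now.hour == c->alarm.hour &&
        c->now.minute == c->alarm.minute &&
        c->now.pm == c->alarm.pm)
        c->buzzing = true;
}
```

Because check_alarm() looks only at the stored times and the alarm flag, the buzzer starts at the right moment even when no button has been touched.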
From the software architecture it can be seen that a timer needs to be
connected to the CPU. Logic to connect the buttons to the
CPU bus will also be needed.
Finally, before starting to write code and build hardware, draw the
State Transition Graph for the clock’s commands.
That diagram will be used to guide the implementation of the software
components.
System Architecture cont’d….
Component Design and Testing
The 2 major software components, the Interrupt Handler and the Foreground
Code, can be implemented relatively straightforwardly.
As the functionality of the Interrupt Handler is in the interruption process
itself, that code is best tested on the Microprocessor Platform.
The Foreground Code can be more easily tested on the PC or workstation
used for code development.
A testbench can be created for this code which generates button depressions
to exercise the state machine.
The advancement of the system clock also needs to be simulated.
A better testing strategy than executing the Interrupt Handler directly
is to add testing code that updates the clock, perhaps once per four
iterations of the foreground while loop.
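The clock simulation for the testbench could be sketched as follows. The names are hypothetical, and the choice of advancing simulated time by one minute per update is an assumption; the notes only say that the clock is updated once per four iterations.

```c
#include <assert.h>
#include <stddef.h>

#define TICK_EVERY 4  /* foreground loop iterations per simulated clock update */

struct testbench {
    unsigned iterations;  /* foreground loop iterations executed so far */
    unsigned minutes;     /* simulated time, in minutes (assumed unit)  */
};

/* Testing code placed in the foreground loop in place of the timer interrupt. */
void simulate_clock(struct testbench *tb)
{
    assert(tb != NULL);
    tb->iterations++;
    if (tb->iterations % TICK_EVERY == 0)
        tb->minutes++;
}

/* Run the foreground loop for a fixed number of iterations on the host. */
void run_foreground(struct testbench *tb, unsigned count)
{
    assert(tb != NULL);
    for (unsigned i = 0; i < count; i++) {
        /* read_buttons(), process_command() and check_alarm()
           would be called here with scripted button values */
        simulate_clock(tb);
    }
}
```

Running ten iterations advances the simulated clock twice, so a test can reach any alarm time without waiting on real interrupts.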
With the timer taken care of in this way, the focus can be on implementing the logic to
interface to the buttons, display, and buzzer.
The buttons will require debouncing logic.
The display will require a register to hold the current display value in order to
drive the display elements.
System Integration and Testing
Because this system has a small number of
components, system integration is relatively easy.
The software must be checked to ensure that
debugging code has been turned off.
Three types of Tests can be performed.
1. The clock’s accuracy can be checked against a
reference clock.
2. The commands can be exercised from the
buttons.
3. The buzzer’s functionality should be verified.
THANK YOU
Ecs 7th sem-cse-unit-3

  • 1. Embedded Computing Systems Unit – III Text Book Used: Wayne Wolf: Computers as Components, Principles of Embedded Computing Systems Design, 2nd Edition, Elsevier, 2008. By Dr. K. Satyanarayan Reddy CiTECH, B’lore-36.
  • 2. Bus-Based Computer Systems THE CPU BUS: A computer system comprises of the CPU; it also includes memory and I/O devices. The bus is the mechanism by which the CPU communicates with Memory and Devices. A Bus is, at a minimum, a collection of wires, but the bus also defines a protocol by which the CPU, memory, and devices communicate. One of the major roles of the bus is to provide an interface to memory. Bus Protocols: The basic building block of most bus protocols is the Four-cycle Handshake, as shown in adjacent Figure : October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 2 The Four-cycle Handshake
  • 3. Bus-Based Computer Systems cont’d…. Bus Protocols cont’d.: 1. Device 1 raises its output to signal an enquiry, which tells device 2 that it should get ready to listen for data. 2. When device 2 is ready to receive, it raises its output to signal an acknowledgment. At this point, devices 1 and 2 can transmit or receive. 3. Once the data transfer is complete, device 2 lowers its output, signaling that it has received the data. 4. After seeing that ack has been released, device 1 lowers its output. At the end of the handshake, both handshaking signals are low, just as they were at the start of the handshake. The system has thus returned to its original state in readiness for another handshake-enabled data transfer. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 3
  • 4. The term bus is used in 2 ways. A set of related wires, such as data/address wires also the term may mean a protocol for communicating between components. To avoid confusion, the term bundle will be used to refer to a set of related signals. The fundamental bus operations are READING and WRITING. Figure below shows the structure of a typical bus that supports reads and writes. The major components follow: ■ Clock provides synchronization to the bus components, ■ R/W is true when the bus is reading and false when the bus is writing, ■ Address is an a-bit bundle of signals that transmits the address for an access, ■ Data is an n-bit bundle of signals that can carry data to or from the CPU, and ■ Data ready signals when the values on the data bundle are valid. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 4 Bus-Based Computer Systems cont’d…. A typical Microprocessor Bus
  • 5. All transfers on this basic bus are controlled by the CPU, which can read or write a device or memory, but devices or memory cannot initiate a transfer on their own. This is reflected by the fact that R/W and Address are unidirectional signals, since only the CPU can determine the address and direction of the transfer. The behavior of a bus is specified with a Timing Diagram, which shows how the signals on a bus change over time, but since values like the address and data can take on many values, some standard notation is used to describe signals, as shown in Figure below: October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 5 Bus-Based Computer Systems cont’d….
  • 6. A’s value is known at all times, so it is shown as a standard waveform that changes between 0 and 1. B and C alternate between changing and stable states. A stable signal has, as the name implies, a stable value that could be measured by an oscilloscope. e.g.: An address bus may be shown as stable when the address is present, but the bus’s timing requirements are independent of the exact address on the bus. A signal can go between a known 0/1 state and a stable/changing state. A changing signal does not have a stable value. Changing signals should not be used for computation. To be sure that signals go to their proper values at the proper times, timing diagrams sometimes show Timing Constraints. The Timing Constraints are drawn in two different ways, depending on the amount of time between events or on the order of events. e.g.: The timing constraint from A to B, shows that A must go high before B becomes stable. The constraint from A to B also has a time value of 10 ns, indicating that A goes high at least 10 ns before B goes stable. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 6 Bus-Based Computer Systems cont’d….
  • 7. The adjacent figure shows a timing diagram for the example bus. The diagram shows a Read and a Write. Timing Constraints are shown only for the Read operation, but similar constraints apply to the write operation. The bus is normally in the read mode since that does not change the state of any of the devices or memories. Note: The direction of data transfer on bidirectional lines is not specified in the timing diagram. During a read, the external device or memory is sending a value on the data lines, while during a write the CPU is controlling the data lines. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 7 Timing Diagram for the Bus Bus-Based Computer Systems cont’d….
  • 8. The sequence of operations for a READ on the Timing Diagram as follows: ■ A read or write is initiated by setting address enable high after the clock starts to rise. Setting R/W = 1 to indicate a read, and the address lines are set to the desired address. ■ After 1 clock cycle, the memory or device is expected to assert the data value at that address on the data lines. Simultaneously, the external device specifies that the data are valid by pulling down the data ready line. This line is active low, meaning that a logically true value is indicated by a low voltage, in order to provide increased immunity to electrical noise. ■ The CPU is free to remove the address at the end of the clock cycle and must do so before the beginning of the next cycle. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 8 Bus-Based Computer Systems cont’d….
  • 9. The Handshake that tells the CPU and Devices when data are to be transferred is formed by data ready for the acknowledge side, but is implicit for the enquiry side. Since the bus is normally in read mode, “enq” does not need to be asserted, but the “acknowledge” must be provided by Data Ready. The Data Ready signal allows the bus to be connected to devices that are slower than the bus. As shown in adjacent Figure, the external device need not immediately assert data ready. The cycles between the minimum time at which data can be asserted and when it is actually asserted are known as Wait States. Wait states are commonly used to connect slow, inexpensive memories to buses. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 9 A wait state on a read operation Bus-Based Computer Systems cont’d….
  • 10. The bus handshaking signals can also be used to perform Burst Transfers, as illustrated in Figure on right. In this Burst Read Transaction, the CPU sends one address but receives a sequence of data values. Here an extra line is added to the bus, called Burst9, which signals when a transaction is actually a burst. Releasing the burst9 signal tells the device that enough data has been transmitted. To stop receiving data after the end of data 4, the CPU releases the burst9 signal at the end of data 3 since the device requires some time to recognize the end of the burst. Those values come from successive memory locations starting at the given address. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 10 A burst read transaction Bus-Based Computer Systems cont’d….
  • 11. Some buses provide Disconnected Transfers. In these buses, the request and response are separate. A first operation requests the transfer. The bus can then be used for other operations. The transfer is completed later, when the data are ready. The state machine view of the bus transaction is also helpful and a useful complement to the timing diagram. Figure on right shows the CPU and device state machines for the read operation. As with a timing diagram, not all the possible values of address and data lines are shown, instead transitions of control signals are dealt with. When the CPU decides to perform a read transaction, it moves to a new state, sending bus signals that cause the device to behave appropriately. The device’s state transition graph captures its side of the protocol. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 11 State diagrams for the bus read transaction Bus-Based Computer Systems cont’d….
  • 12. Some buses have Data Bundles that are smaller than the word size of the CPU, thus using fewer data lines reduces the cost of the chip. Byte addresses are sequentially sent over the bus, receiving one byte at a time; the bytes are assembled inside the CPU’s bus logic before being presented to the CPU proper. Some buses use multiplexed address and data. As shown in Figure on right, additional control lines are provided to tell whether the value on the address/data lines is an address or data. Typically, the address comes first on the combined address/data lines, followed by the data. The address can be held in a register until the data arrive so that both can be presented to the device (such as a RAM) at the same time. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 12 Bus signals for multiplexing address and data Bus-Based Computer Systems cont’d….
  • 13. Direct Memory Access (DMA) Standard bus transactions require the CPU to be in the middle of every read and write transaction. However, there are certain types of data transfers in which the CPU does not need to be involved. e.g.: A high-speed I/O device may wish to transfer a block of data into memory. This capability requires that some unit other than the CPU, to be able to control operations on the bus. Direct memory access (DMA) is a bus operation that allows reads and writes not controlled by the CPU. A DMA transfer is controlled by a DMA controller, which requests control of the bus from the CPU. After gaining control, the DMA controller performs read and write operations directly between devices and memory. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 13
  • 14. Figure below shows the configuration of a bus with a DMA controller. The DMA requires the CPU to provide two additional bus signals: ■ The bus request is an input to the CPU through which DMA controllers ask for ownership of the bus. ■ The bus grant signals that the bus has been granted to the DMA controller. October 6, 2014 ECS Lecture Notes VII Sem CSE (VTU), 3rd Unit: By Dr. K Satyanarayan Reddy 14 Direct Memory Access cont’d….
  • 15. Direct Memory Access cont’d…. A device that can initiate its own bus transfer is known as a Bus Master. The DMA controller uses the bus request and bus grant signals to gain control of the bus using a classic four-cycle handshake. The bus request is asserted by the DMA controller when it wants to control the bus, and the bus grant is asserted by the CPU when the bus is ready. The CPU will finish all pending bus transactions before granting control of the bus to the DMA controller. When it does grant control, it stops driving the other bus signals: R/W, address, and so on. Upon becoming bus master, the DMA controller has control of all bus signals and can perform reads and writes using the same bus protocol as any CPU-driven bus transaction. Memory and devices do not know whether a read or write is performed by the CPU or by a DMA controller. After the transaction is finished, the DMA controller returns the bus to the CPU by de-asserting the bus request, causing the CPU to de-assert the bus grant.
  • 16. Direct Memory Access cont’d…. The CPU controls the DMA operation through registers in the DMA controller. A typical DMA controller includes the following three registers: ■ A starting address register specifies where the transfer is to begin. ■ A length register specifies the number of words to be transferred. ■ A status register allows the DMA controller to be operated by the CPU. The CPU initiates a DMA transfer by setting the starting address and length registers appropriately and then writing the status register to set its start transfer bit. After the DMA operation is complete, the DMA controller interrupts the CPU to tell it that the transfer is done. The CPU’s role during a DMA transfer: the CPU cannot use the bus while the DMA controller is bus master. As shown in Figure 4.10 (UML sequence diagram of system activity around a DMA transfer), if the CPU has enough instructions and data in the cache and registers, it may be able to continue doing useful work for quite some time, oblivious of the DMA transfer. But once the CPU needs the bus, it stalls until the DMA controller returns bus mastership to the CPU.
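The register-level sequence described on this slide (set starting address and length, set the start bit, wait for completion) can be sketched as a small software simulation. The register layout, the bit positions `DMA_STATUS_START` and `DMA_STATUS_DONE`, and the function names are hypothetical; a real controller defines its own register map and raises an interrupt rather than just setting a bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_STATUS_START 0x1u  /* assumed bit position: start transfer */
#define DMA_STATUS_DONE  0x2u  /* assumed bit position: transfer complete */

/* The three registers named in the text; the layout is hypothetical. */
typedef struct {
    uint32_t start_address;  /* index of the first word to transfer */
    uint32_t length;         /* number of words to transfer */
    uint32_t status;         /* control/status bits */
} dma_regs;

/* CPU side: program address and length, then set the start bit. */
static void dma_begin(dma_regs *dma, uint32_t start_address, uint32_t length)
{
    assert(dma != NULL);
    dma->start_address = start_address;
    dma->length = length;
    dma->status = DMA_STATUS_START;
}

/* Controller side (simulated): copy words from a device buffer into memory,
   then flag completion; real hardware would raise an interrupt here. */
static void dma_run(dma_regs *dma, uint32_t *memory, const uint32_t *device)
{
    if ((dma->status & DMA_STATUS_START) == 0u) {
        return;  /* no transfer has been requested */
    }
    for (uint32_t i = 0; i < dma->length; i++) {
        memory[dma->start_address + i] = device[i];
    }
    dma->status = DMA_STATUS_DONE;
}
```

The split into `dma_begin` and `dma_run` mirrors the division of labor in the text: the CPU only writes registers, and the data movement itself happens without CPU involvement.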
  • 17. System Bus Configurations: A microprocessor system generally has more than one bus. As shown in the figure (a multiple bus system), high-speed devices may be connected to a high-performance bus, while lower-speed devices are connected to a different bus. A small block of logic known as a Bridge allows the buses to connect to each other. There are several reasons to use multiple buses and bridges: ■ Higher-speed buses may provide wider data connections. ■ A high-speed bus usually requires more expensive circuits and connectors, so the cost of low-speed devices can be held down by using a lower-speed, lower-cost bus. ■ The bridge may allow the buses to operate independently, thereby providing some parallelism in I/O operations.
  • 18. System Bus Configurations cont’d…. Operation of a bus bridge: The bridge is a slave on the fast bus and the master of the slow bus. The bridge takes commands from the fast bus (on which it is a slave) and issues those commands on the slow bus (of which it is the master). It also returns the results from the slow bus to the fast bus; e.g., it returns the results of a read on the slow bus to the fast bus. In the UML state diagram of bus bridge operation, the upper sequence of states handles a write from the fast bus to the slow bus. These states must read the data from the fast bus and set up the handshake for the slow bus. Operations on the fast and slow sides of the bus bridge should be overlapped as much as possible to reduce the latency of bus-to-bus transfers. Similarly, the bottom sequence of states reads from the slow bus and writes the data to the fast bus.
  • 19. AMBA Bus: The AMBA bus supports CPUs, memories, and peripherals integrated in a system-on-silicon. As shown in the figure (elements of the ARM AMBA bus system), the AMBA specification includes two buses. The AMBA High-performance Bus (AHB) is optimized for high-speed transfers and is directly connected to the CPU. It supports several high-performance features: pipelining, burst transfers, split transactions, and multiple bus masters. A bridge can be used to connect the AHB to an AMBA Peripherals Bus (APB). The APB is designed to be simple and easy to implement; it also consumes relatively little power. The APB assumes that all peripherals act as slaves, simplifying the logic required in both the peripherals and the bus controller. It also does not perform pipelined operations, which simplifies the bus logic.
  • 20. MEMORY DEVICES Memory Device Organization: A memory is characterized by its capacity, such as 256 Mb. E.g., a 256-Mb memory may be available in two versions: ■ As a 64M × 4-bit array, a single memory access obtains a 4-bit data item, with a maximum of 2^26 different addresses. ■ As a 32M × 8-bit array, a single memory access obtains an 8-bit data item, with a maximum of 2^25 different addresses. The height/width ratio of a memory is known as its Aspect Ratio. The best aspect ratio depends on the amount of memory required. Internally, the data are stored in a two-dimensional array of memory cells (Figure: Internal organization of a memory device). Because the array is two-dimensional, the n-bit address received by the chip is split into a row address and a column address (with n = r + c).
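A minimal sketch of the n = r + c address split follows. It assumes the row address occupies the upper bits and the column address the lower bits, which is a common convention but not something the slide specifies; the type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t row;
    uint32_t column;
} cell_address;

/* Split an n-bit address into r row bits (upper) and c column bits (lower).
   Which bits go to the row is an assumption; chips differ. */
static cell_address split_address(uint32_t address, unsigned column_bits)
{
    assert(column_bits > 0u && column_bits < 32u);
    cell_address out;
    out.row = address >> column_bits;
    out.column = address & ((1u << column_bits) - 1u);
    return out;
}
```

For a 26-bit address with r = c = 13, the address `(5 << 13) | 7` selects row 5 and column 7.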
  • 21. MEMORY DEVICES cont’d…. The row and column addresses select a particular memory cell. If the memory’s external width is 1 bit, the column address selects a single bit; for wider data widths, the column address can be used to select a subset of the columns. Most memories include an enable signal that controls the tri-stating of data onto the memory’s pins. A read/write signal (R/W in the figure) on read/write memories controls the direction of data transfer; memory chips do not typically have separate read and write data pins.
  • 22. Random-Access Memories: Random-access memories can be both read and written. They are called random access because addresses can be read in any order. Most bulk memory in modern systems is dynamic RAM (DRAM). DRAM is very dense; it does, however, require that its values be refreshed periodically since the values inside the memory cells decay over time. The dominant form of dynamic RAM today is the synchronous DRAM (SDRAM), which uses clocks to improve DRAM performance. SDRAMs use Row Address Select (RAS) and Column Address Select (CAS) signals to break the address into two parts, which select the proper row and column in the RAM array. Signal transitions are relative to the SDRAM clock, which allows the internal SDRAM operations to be pipelined.
  • 23. Random-Access Memories cont’d…. As shown in the figure (timing diagram for a read on a synchronous DRAM), transitions on the control signals are related to a clock. RAS and CAS can therefore become valid at the same time. The address lines are not shown in full detail here; some address lines may not be active depending on the mode in use. SDRAMs use a separate refresh signal to control refreshing. DRAM has to be refreshed roughly once per millisecond, and DRAMs refresh part of the memory at a time instead of refreshing the entire memory at once. When a section of memory is being refreshed, it cannot be accessed until the refresh is complete. The refresh is spread over many short operations, so that a section is refreshed every few microseconds.
  • 24. Read Only Memories (ROM): Read-only memories (ROMs) are preprogrammed with fixed data. They are very useful in embedded systems since a great deal of the code, and perhaps some data, does not change over time. There are several types of ROM available; the main distinction is between factory-programmed ROM (sometimes called mask-programmed ROM) and field-programmable ROM. Factory-programmed ROMs are ordered from the factory with particular programming. ROMs can typically be ordered in lots of a few thousand, but clearly factory programming is useful only when the ROMs are to be installed in some quantity. Field-programmable ROMs, on the other hand, can be programmed in the lab. Flash memory is the dominant form of field-programmable ROM and is electrically erasable. Flash memory uses standard system voltage for erasing and programming, allowing it to be reprogrammed inside a typical system. Early flash memories had to be erased in their entirety; modern devices allow memory to be erased in blocks. Most flash memories today allow certain blocks to be protected; the boot-up code is kept in a protected block while the other memory blocks on the device can be updated. This form of flash is commonly known as boot-block flash.
  • 25. I/O DEVICES Timers and Counters: Timers and counters are distinguished largely on the basis of their usage, not their logic. Both are built from adder logic with registers to hold the current value, with an increment input that adds one to the current register value. However, a Timer has its count input connected to a periodic clock signal to measure time intervals, while a Counter has its count input connected to an aperiodic signal in order to count the number of occurrences of some external event. Because the same logic can be used for either purpose, the device is often called a Counter/Timer.
  • 26. I/O DEVICES cont’d…. The figure (internals of a counter/timer) shows enough of the internals of a counter/timer to illustrate its operation. An n-bit counter/timer uses an n-bit register to store the current state of the count and an array of half subtractors to decrement the count when the count signal is asserted. Combinational logic checks when the count equals zero; the done output signals the zero count. It is often useful to be able to control the time-out, rather than require exactly 2^n events to occur. For this purpose, a reset register provides the value with which the count register is to be loaded. The counter/timer provides logic to load the reset register. Most counters provide both cyclic and acyclic modes of operation. In cyclic mode, once the counter reaches the done state, it is automatically reloaded and the counting process continues. In acyclic mode, the counter/timer waits for an explicit signal from the microprocessor to resume counting.
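The behavior of the down-counter with a reset register and the two modes can be modeled in software. This is a sketch under the assumption that the count is decremented once per assertion of the count input and that done is signaled when the count reaches zero; the struct fields and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t count;        /* current count register */
    uint32_t reset_value;  /* reset register: value reloaded into count */
    bool cyclic;           /* true: reload automatically on done */
    bool running;          /* false while waiting for a restart (acyclic) */
} counter_timer;

/* Load the reset register and start counting from that value. */
static void ct_load(counter_timer *ct, uint32_t reset_value, bool cyclic)
{
    assert(ct != NULL && reset_value > 0u);
    ct->reset_value = reset_value;
    ct->count = reset_value;
    ct->cyclic = cyclic;
    ct->running = true;
}

/* One assertion of the count input. Returns true when the count reaches
   zero (the "done" output). */
static bool ct_count(counter_timer *ct)
{
    if (!ct->running) {
        return false;
    }
    ct->count--;
    if (ct->count != 0u) {
        return false;
    }
    if (ct->cyclic) {
        ct->count = ct->reset_value;  /* automatic reload */
    } else {
        ct->running = false;          /* wait for ct_restart() */
    }
    return true;
}

/* Explicit restart from the microprocessor, used in acyclic mode. */
static void ct_restart(counter_timer *ct)
{
    ct->count = ct->reset_value;
    ct->running = true;
}
```

Loading a reset value of 3 in cyclic mode makes every third call to `ct_count` report done; in acyclic mode the counter stays idle after the first done until `ct_restart` is called.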
  • 27. I/O DEVICES cont’d…. A Watchdog Timer is an I/O device that is used for internal operation of a system. As shown in the figure (a watchdog timer), the watchdog timer is connected to the CPU bus and also to the CPU’s reset line. The CPU’s software is designed to periodically reset the watchdog timer before the timer ever reaches its time-out limit. If the watchdog timer ever does reach that limit, its time-out action is to reset the processor. In that case, the presumption is that either a software flaw or a hardware problem has caused the CPU to misbehave. Rather than diagnosing the problem, the system is reset to get it operational as quickly as possible.
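The watchdog protocol can be illustrated with a small simulation. The names `watchdog_kick` and `watchdog_tick` and the tick-based time-out are assumptions made for the sketch; on real hardware the kick is a write to a device register and the time-out drives the reset line directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t timeout_ticks;  /* time-out limit */
    uint32_t elapsed;        /* ticks since the last reset by software */
} watchdog;

/* Called by healthy software often enough to stay below the limit. */
static void watchdog_kick(watchdog *wd)
{
    assert(wd != NULL);
    wd->elapsed = 0u;
}

/* One timer tick. Returns true when the time-out fires, i.e. when the
   hardware would pull the CPU's reset line. */
static bool watchdog_tick(watchdog *wd)
{
    assert(wd != NULL);
    wd->elapsed++;
    if (wd->elapsed >= wd->timeout_ticks) {
        wd->elapsed = 0u;  /* the counter restarts after the reset */
        return true;
    }
    return false;
}
```

As long as the software calls `watchdog_kick` more often than every `timeout_ticks` ticks, `watchdog_tick` never returns true; a hung program stops kicking and the reset follows.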
  • 28. A/D and D/A Converters: Analog/digital (A/D) and digital/analog (D/A) converters (typically known as ADCs and DACs, respectively) are often used to interface non-digital devices to embedded systems. Because A/D conversion requires more complex circuitry, it requires a somewhat more complex interface. Analog/digital conversion requires sampling the analog input before converting it to digital form. A control signal causes the A/D converter to take a sample and digitize it. A typical A/D interface has, in addition to its analog inputs, two major digital inputs: a data port allows A/D registers to be read and written, and a clock input tells when to start the next conversion. D/A conversion is relatively simple, so the D/A converter interface generally includes only the data value. The input value is continuously converted to analog form.
  • 29. Keyboards: A keyboard is basically an array of switches, but it may include some internal logic to help simplify the interface to the microprocessor. A switch uses a mechanical contact to make or break an electrical circuit. The major problem with mechanical switches is that they bounce (Figure: Switch bouncing). When the switch is depressed by pressing on the button attached to the switch’s arm, the force of the depression causes the contacts to bounce several times until they settle down. If this is not corrected, it will appear that the switch has been pressed several times, giving false inputs. A hardware debouncing circuit can be built using a one-shot timer. Software can also be used to debounce switch inputs. A raw keyboard can be assembled from several switches. Each switch in a raw keyboard has its own pair of terminals, making raw keyboards impractical when a large number of keys is required.
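One common way to debounce in software is to accept a new switch state only after it has been seen in several consecutive samples. The sketch below follows that approach; the sampling scheme, the threshold, and all names are assumptions, since the slide does not prescribe a particular method.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool stable_state;   /* debounced output */
    bool last_sample;    /* most recent raw sample */
    uint8_t run_length;  /* consecutive samples equal to last_sample */
    uint8_t threshold;   /* samples required before accepting a change */
} debouncer;

/* Feed one raw sample (taken at a fixed period, e.g. from a timer
   interrupt); returns the debounced state. */
static bool debounce_sample(debouncer *d, bool raw)
{
    assert(d != NULL && d->threshold > 0u);
    if (raw == d->last_sample) {
        if (d->run_length < d->threshold) {
            d->run_length++;
        }
    } else {
        d->last_sample = raw;
        d->run_length = 1u;
    }
    if (d->run_length >= d->threshold) {
        d->stable_state = d->last_sample;
    }
    return d->stable_state;
}
```

With a threshold of 3, a bouncing sequence such as 1, 0, 1, 1, 1 changes the debounced output only on the last sample, once three equal readings have arrived in a row.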
  • 30. Keyboards cont’d…. More expensive keyboards, such as those used in PCs, actually contain a microprocessor to preprocess button inputs. PC keyboards typically use a 4-bit microprocessor to provide the interface between the keys and the computer. The microprocessor can provide debouncing, and it provides other functions as well. An encoded keyboard uses some code to represent which switch is currently being depressed. At the heart of the encoded keyboard is the scanned array of switches (Figure: A scanned key array). Unlike a raw keyboard, the scanned keyboard array reads only one row of switches at a time. The demultiplexer at the left side of the array selects the row to be read. When the scan input is 1, that value is transmitted to one terminal of each key in the row. If the switch is depressed, the 1 is sensed at that switch’s column. Since only one switch in the column is activated, that value uniquely identifies a key. The row address and column output can be used for encoding, or circuitry can be used to give a different encoding.
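The row-by-row scan can be sketched as a loop over rows that reads the column lines for each one. The 4 × 4 array size, the `read_columns_fn` hardware hook, and the `row * KEY_COLS + column` key code are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_ROWS 4u
#define KEY_COLS 4u
#define NO_KEY   (-1)

/* Hypothetical hardware hook: drive a 1 onto the selected row and return the
   column lines as a bit mask (bit c set = switch at row,c is closed). */
typedef uint8_t (*read_columns_fn)(unsigned row);

/* Scan one row at a time; return row * KEY_COLS + column for the first
   closed switch found, or NO_KEY if nothing is pressed. */
static int scan_keyboard(read_columns_fn read_columns)
{
    assert(read_columns != NULL);
    for (unsigned row = 0; row < KEY_ROWS; row++) {
        uint8_t columns = read_columns(row);
        for (unsigned col = 0; col < KEY_COLS; col++) {
            if (columns & (1u << col)) {
                return (int)(row * KEY_COLS + col);
            }
        }
    }
    return NO_KEY;
}

/* Simulated switch array standing in for the real matrix. */
static uint8_t demo_matrix[KEY_ROWS];

static uint8_t demo_read_columns(unsigned row)
{
    return demo_matrix[row];
}
```

This version reports only the first closed switch it finds, so it behaves like the naive encoder; handling key combinations or rollover would require collecting every closed switch in a scan.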
  • 31. Keyboards cont’d…. There are two problems associated with encoding the keyboard: 1. Combinations of keys may not be represented. E.g., on a PC keyboard, the encoding must be chosen so that combinations such as control-Q can be recognized and sent to the PC. 2. Rollover may not be allowed. E.g., if “a” is pressed and then “b” is pressed before “a” is released, most applications need the keyboard to send an “a” followed by a “b”. Rollover is very common in typing at even modest rates. A naive implementation of the encoder circuitry will simply throw away any character depressed after the first one until all the keys are released. The keyboard microcontroller can be programmed to provide n-key rollover, so that rollover keys are sensed, put on a stack, and transmitted in sequence as keys are released.
  • 32. Light Emitting Diodes (LEDs): LEDs are often used as simple displays by themselves, and arrays of LEDs may form the basis of more complex displays. The figure (an LED connected to a digital output) shows how to connect an LED to a digital output. A resistor is connected between the output pin and the LED to absorb the voltage difference between the digital output voltage and the 0.7 V drop across the LED. When the digital output goes to 0, the LED voltage is in the device’s off region and the LED is not on.
  • 33. Displays: A display device may be either directly driven or driven from a frame buffer. Displays with a small number of elements are driven directly by logic, while large displays use a RAM frame buffer. The n-digit array (Figure: An n-digit display) is a simple example of a display that is usually directly driven. A single-digit display typically consists of seven segments; each segment may be either an LED or a Liquid Crystal Display (LCD) element. This display relies on the digits being visible for some time after the drive to the digit is removed, which is true for both LEDs and LCDs. The digit input is used to choose which digit is currently being updated, and the selected digit activates its display elements based on the current data value. The display’s driver is responsible for repeatedly scanning through the digits and presenting the current value of each to the display.
  • 34. Displays cont’d…. A Frame Buffer is a RAM that is attached to the system bus. The microprocessor writes values into the frame buffer in whatever order is desired. The pixels in the frame buffer are generally written to the display in raster order by reading pixels sequentially. Many large displays are built using LCDs. Each pixel in the display is formed by a single liquid crystal. LCD displays present a very different interface to the system because the array of pixel LCDs can be randomly accessed. Early LCD panels were called passive matrix because they relied on a two-dimensional grid of wires to address the pixels. Modern LCD panels use an active matrix system that puts a transistor at each pixel to control access to the LCD. Active matrix displays provide higher contrast and a higher-quality display.
  • 35. Touchscreens: A touchscreen is an input device overlaid on an output device. The touchscreen registers the position of a touch on its surface. By overlaying this on a display, the user can react to information shown on the display. The two most common types of touchscreens are resistive and capacitive. Resistive touchscreen: It uses a two-dimensional voltmeter to sense position. As shown in the figure (cross section of a resistive touchscreen), the touchscreen consists of two conductive sheets separated by spacer balls. The top conductive sheet is flexible so that it can be pressed to touch the bottom sheet. A voltage is applied across the sheet; its resistance causes a voltage gradient to appear across the sheet. The top sheet samples the conductive sheet’s applied voltage at the contact point. An analog/digital converter is used to measure the voltage and hence the position. The touchscreen alternates between x and y position sensing by alternately applying horizontal and vertical voltage gradients.
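Converting the sampled voltage into a position amounts to a linear scaling, provided the voltage gradient across the sheet is linear. The sketch below assumes a 10-bit A/D converter (`ADC_MAX` of 1023) and a hypothetical function name; real touchscreens usually also need calibration offsets.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_MAX 1023u  /* assumed 10-bit converter */

/* Map an ADC reading taken along one axis to a pixel coordinate, assuming
   the voltage gradient across the sheet is linear. */
static uint32_t adc_to_pixel(uint32_t adc_reading, uint32_t pixels)
{
    assert(adc_reading <= ADC_MAX && pixels > 0u);
    return (adc_reading * (pixels - 1u)) / ADC_MAX;
}
```

The same function is called twice per touch, once while the horizontal gradient is applied (x) and once while the vertical gradient is applied (y).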
  • 36. COMPONENT INTERFACING Memory Interfacing: The memory structure is simple if a memory of exactly the required size can be bought. If more memory is needed than can be bought in a single chip, several memory chips must be combined to construct a memory of the required size; e.g., if 4 GB of memory is needed and single memory chips are available in 2 GB, then two memory chips are needed. It may also be necessary to build a memory that is wider than can be bought on a single chip; e.g., a 32-bit-wide memory chip generally cannot be bought. A memory of a given width (32 bits, 64 bits, etc.) can easily be constructed by placing RAMs in parallel. Logic may also be needed to turn the bus signals into the appropriate memory signals; e.g., most buses do not send address signals in row and column form. The appropriate refresh signals must also be generated.
  • 37. COMPONENT INTERFACING cont’d…. Device Interfacing: Some I/O devices are designed to interface directly to a particular bus, forming glueless interfaces. But glue logic is required when a device is connected to a bus for which it is not designed. An I/O device typically requires a much smaller range of addresses than a memory, so addresses must be decoded much more finely. Some additional logic is required to cause the bus to read and write the device’s registers.
  • 38. COMPONENT INTERFACING cont’d…. A glue logic interface: the figure shows an interfacing scheme for a simple I/O device. The device has four registers that can be read and written by presenting the register number on the regid pins, asserting R/W as required, and reading or writing the value on the regval pins. To interface to the bus, the bottom two bits of the address are used to refer to registers within the device, and the remaining bits are used to identify the device itself. The top bits of the address are sent to a comparator for testing against the device address. The device’s address can be set with switches to allow the address to be easily changed. When the bus address matches the device’s, the result is used to enable a transceiver for the data pins. When the transceiver is disabled, the regval pins are disconnected from the data bus. The comparator’s output is also used to modify the R/W signal: the device’s R/W pin is given the value (bus R/W + not-equal address), where + denotes logical OR, so that when the comparator’s result is not 1, the device’s R/W pin always receives a 1 to avoid inadvertently writing the device registers.
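The glue logic on this slide is combinational, so it can be modeled as a pure function of the bus signals. In this sketch the struct and function names are illustrative, and R/W = 1 is taken to mean read, which matches the slide's remark that forcing R/W to 1 prevents inadvertent writes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define REG_BITS 2u  /* bottom two address bits select one of four registers */

typedef struct {
    bool transceiver_enable;  /* connect regval pins to the data bus */
    bool device_rw;           /* 1 = read, 0 = write (assumed polarity) */
    uint8_t regid;            /* register number presented to the device */
} glue_outputs;

/* device_address models the switch setting compared against the top bits. */
static glue_outputs glue_decode(uint32_t bus_address, bool bus_rw,
                                uint32_t device_address)
{
    assert(device_address <= (UINT32_MAX >> REG_BITS));
    glue_outputs out;
    bool match = (bus_address >> REG_BITS) == device_address;
    out.transceiver_enable = match;
    out.device_rw = bus_rw || !match;  /* bus R/W OR not-equal address */
    out.regid = (uint8_t)(bus_address & ((1u << REG_BITS) - 1u));
    return out;
}
```

A write to an address whose top bits do not match leaves the transceiver disabled and presents R/W = 1 to the device, so its registers cannot be overwritten by traffic meant for another device.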
  • 39. DESIGNING WITH MICROPROCESSORS System Architecture: An architecture is a set of elements and the relationships between them that together form a single unit. The architecture of an embedded computing system is the blueprint for implementing that system; it tells what components are needed and how they are put together. It includes both hardware and software elements, some of which may be less obvious than others. ■ CPU: An embedded computing system clearly contains a microprocessor. There are many different architectures, and even within an architecture there are models that vary in clock speed, bus data width, integrated peripherals, and so on. The choice of the CPU is one of the most important, but it cannot be made without considering the software that will execute on the machine. ■ Bus: The choice of a bus is closely tied to that of a CPU, since the bus is an integral part of the microprocessor. But in applications that make intensive use of the bus due to I/O or other data traffic, the bus may be more of a limiting factor than the CPU.
  • 40. System Architecture cont’d…. ■ Memory: The most obvious characteristic of the memory is its total size, which depends on both the required data volume and the size of the program instructions. The ratio of ROM to RAM and the selection of DRAM versus SRAM can have a significant influence on the cost of the system. The speed of the memory plays a large role in determining system performance. ■ Input and output devices: For a given function, there may be several different devices of varying sophistication and cost that can do the job. A device is classed as an input or an output device according to the operation it is used for. E.g., a set of switches and knobs on a front panel may all be controlled by a single microcontroller, which is in turn connected to the main CPU. The difficulty of using a particular device, such as the amount of glue logic required to interface it, may also play a role in final device selection.
  • 41. Hardware Design Step I: Consider evaluation boards supplied by the microprocessor manufacturer or by another company working in collaboration with the manufacturer. Evaluation boards are sold for many microprocessor systems; they typically include the CPU, some memory, a serial link for downloading programs, and some minimal number of I/O devices. The figure shows an ARM evaluation board manufactured by Sharp. The evaluation board may be a complete solution or may provide what is needed with only slight modifications. If the evaluation board is supplied by the microprocessor vendor, its design may be available from the vendor. If the evaluation board comes from a third party, it may be possible to contract them to design a new board with the required modifications; otherwise, a new board design can be started from scratch.
  • 42. Hardware Design cont’d…. Step II: The other major task is the choice of memory and peripheral components. In the case of I/O devices, there are two alternatives for each device: selecting a component from a catalog or designing one from scratch. When shopping for devices from a catalog, it is important to read data sheets carefully; it may not be trivial to figure out whether the device does what is needed. Due consideration must also be given to the amount of glue logic required to connect the device to the bus. Simple peripheral logic can be implemented in Programmable Logic Devices (PLDs), while more complex units can be built from Field-programmable Gate Arrays (FPGAs).
  • 43. The PC as a Platform: Personal computers are often used as platforms for embedded computing. Advantages of a PC: it is a predesigned hardware platform with a great many features, a wide variety of I/O devices can be attached to it, and it provides a rich programming environment. Disadvantage: a PC is larger, more power hungry, and more expensive than a custom hardware platform would be. However, for low-volume applications and environments such as factories and offices where size and power are not critical, using a PC to build an embedded system often makes a lot of sense. As shown in the figure (hardware architecture of a typical PC), a typical PC includes several major hardware components: ■ The CPU provides basic computational facilities. ■ RAM is used for program storage. ■ ROM holds the boot program. ■ A DMA controller provides DMA capabilities. ■ Timers are used by the operating system for a variety of purposes. ■ A high-speed bus, connected to the CPU bus through a bridge, allows fast devices to communicate efficiently with the rest of the system. ■ A low-speed bus provides an inexpensive way to connect simpler devices and may be necessary for backward compatibility as well.
  • 44. PCI (Peripheral Component Interconnect): PCI is a high-performance system bus that uses high-speed data transmission techniques and efficient protocols to achieve high throughput. The original PCI standard allowed operation up to 33 MHz; at that rate, a maximum transfer rate of 264 MB/s can be achieved using 64-bit transfers. The revised PCI standard allows the bus to run up to 66 MHz, giving a maximum transfer rate of 528 MB/s with 64-bit-wide transfers. PCI uses wide buses with many data and address bits along with multiple control bits. The width of the PCI bus both increases the cost of an interface to the bus and makes the physical connection to the bus more complicated. USB (Universal Serial Bus) and IEEE 1394 are the two major high-speed serial buses. Both of these buses offer high transfer rates using simple connectors. They also allow devices to be chained together so that users need not worry about the order of devices on the bus or other details of connection.
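The peak transfer rates quoted for a parallel bus follow from multiplying the clock rate by the bus width in bytes. The helper below is a sketch with a hypothetical name; it assumes one transfer per clock cycle and 1 MB = 10^6 bytes, so it gives an upper bound rather than achievable throughput.

```c
#include <assert.h>
#include <stdint.h>

/* Peak rate in MB/s assuming one transfer per clock cycle and
   1 MB = 10^6 bytes; real buses achieve less because of protocol overhead. */
static uint32_t peak_transfer_mb_per_s(uint32_t clock_mhz, uint32_t width_bits)
{
    assert(width_bits % 8u == 0u);
    return clock_mhz * (width_bits / 8u);
}
```

For a 64-bit bus, 33 MHz × 8 bytes gives 264 MB/s and 66 MHz × 8 bytes gives 528 MB/s.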
  • 45. Basic Input / Output System (BIOS): A PC provides a standard software platform that offers interfaces to the underlying hardware as well as more advanced services. At the bottom of the software platform structure in most PCs is a minimal set of software in ROM. This software is designed to load the complete operating system from some other device (disk, network, etc.), and it may also provide low-level hardware interfaces. In the IBM-compatible PC, the low-level software is known as the Basic Input / Output System (BIOS). The BIOS provides low-level hardware drivers as well as booting facilities. The operating system provides high-level drivers, control of executing processes, user interfaces, and so on.
  • 46. System organization of the Intel StrongARM SA-1100 and SA-1111: The StrongARM SA-1100 provides a number of functions besides the ARM CPU. The chip contains two on-chip buses: a high-speed system bus and a lower-speed peripheral bus. The chip also uses two different clocks. A 3.686 MHz clock is used to drive the CPU and high-speed peripherals, and a 32.768 kHz clock is an input to the system control module. The system control module contains the following peripheral devices: ■ A real-time clock ■ An operating system timer ■ 28 general-purpose I/Os (GPIOs) ■ An interrupt controller ■ A power manager controller ■ A reset controller that handles resetting the processor. The 32.768 kHz clock’s frequency is chosen to be useful in timing real-time events. The slower clock is also used by the power manager to provide continued operation of the manager at a lower clock rate and therefore lower power consumption.
  • 47. DEVELOPMENT AND DEBUGGING Development Environments: A typical embedded computing system has a relatively small amount of everything, including CPU horsepower, memory, I/O devices, and so forth. As a result, it is common to do at least part of the software development on a PC or workstation known as a host (Figure: Connecting a host and a target system). The hardware on which the code will finally run is known as the target. The host and target are frequently connected by a USB link, but a higher-speed link such as Ethernet can also be used. The target must include a small amount of software to talk to the host system. That software will take up some memory, interrupt vectors, and so on, but it should generally leave the smallest possible footprint in the target to avoid interfering with the application software. The host should be able to do the following: ■ load programs into the target, ■ start and stop program execution on the target, and ■ examine memory and CPU registers.
  • 48. DEVELOPMENT AND DEBUGGING cont’d…. A cross-compiler is a compiler that runs on one type of machine but generates code for another. After compilation, the executable code is downloaded to the embedded system over a serial link or perhaps burned into a PROM and plugged in. Host-target debuggers are often used, in which the basic hooks for debugging are provided by the target and a more sophisticated user interface is created by the host. A PC or workstation offers a programming environment that is much friendlier than the typical embedded computing platform. A problem with this approach emerges when debugging code that talks to I/O devices: because the host will not have the same devices configured in the same way, the embedded code cannot be run unmodified on the host. A test bench program can be built to help debug the embedded code. The test bench generates inputs to simulate the actions of the input devices; it may also take the output values and compare them against expected values, providing valuable early debugging help. The embedded code may need to be slightly modified to work with the test bench, but careful coding (such as using the #ifdef directive in C) can ensure that the changes can be undone easily and without introducing bugs.
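The #ifdef technique mentioned on this slide can be sketched as follows. Everything device-specific here is hypothetical: the `TESTBENCH` macro, the sensor and actuator functions, and the register addresses in the target branch are placeholders chosen to show the pattern, not a real device interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TESTBENCH 1  /* normally set by the build, e.g. -DTESTBENCH */

#ifdef TESTBENCH
/* Host build: the test bench supplies canned inputs and captures outputs. */
static const uint16_t tb_inputs[] = {100u, 200u, 300u};
static size_t tb_next_input;
static uint16_t tb_last_output;

static uint16_t read_sensor(void)
{
    uint16_t value = tb_inputs[tb_next_input];
    tb_next_input = (tb_next_input + 1u) %
                    (sizeof tb_inputs / sizeof tb_inputs[0]);
    return value;
}

static void write_actuator(uint16_t value) { tb_last_output = value; }

/* Compare the captured output against the expected value. */
static void tb_expect_output(uint16_t expected)
{
    assert(tb_last_output == expected);
}
#else
/* Target build: hypothetical memory-mapped device registers. */
#define SENSOR_REG   (*(volatile uint16_t *)0x40000000u)
#define ACTUATOR_REG (*(volatile uint16_t *)0x40000004u)

static uint16_t read_sensor(void) { return SENSOR_REG; }
static void write_actuator(uint16_t value) { ACTUATOR_REG = value; }
#endif

/* Application code under test: identical in both builds. */
static void control_step(void)
{
    write_actuator((uint16_t)(read_sensor() / 2u));
}
```

Because `control_step` is the same source in both builds, removing the `TESTBENCH` definition is the only change needed to go back to the target configuration.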
  • 49. Debugging Techniques (S/W based): Software debugging can be done by compiling and executing the code on a PC or workstation, but at some point it inevitably becomes necessary to run code on the embedded hardware platform. Embedded systems are usually less friendly programming environments than PCs, but the resourceful designer has several options available for debugging the system. The serial port found on most evaluation boards is one of the most important debugging tools. It is a good idea to design a serial port into an embedded system even if it is not likely to be used in the final product; the serial port can be used not only for development debugging but also for diagnosing problems in the field.
  • 50. Debugging Techniques (S/W based) (cont'd): Another very important debugging tool is the Breakpoint. In its simplest form, the user specifies an address at which the program's execution is to break. When the program counter (PC) reaches that address, control is returned to the monitor program. From the monitor program, the user can examine and/or modify CPU registers, after which execution can be continued. Implementing breakpoints does not require using exceptions or external devices.
  • 51. Debugging Techniques (S/W based) (cont'd): The following programming example shows how to use instructions to create breakpoints. Breakpoints: A breakpoint is a location in memory at which a program stops executing and returns to the debugging tool or monitor program. Implementing breakpoints is very simple: it only requires replacing the instruction at the breakpoint location with a subroutine call to the monitor. For example, to establish a breakpoint at location 0x40c in some ARM code, the branch (B) instruction normally held at that location is replaced with a subroutine call (BL) to the breakpoint handling routine. When the breakpoint handler is called, it saves all the registers and can then display the CPU state to the user and take commands. To continue execution, the original instruction must be restored in the program. If the breakpoint can be erased, the original instruction can simply be put back and control returned to that instruction. This will normally require fixing the subroutine return address, which will point to the instruction after the breakpoint.
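The replace-and-restore idea can be sketched in C on a simulated code memory. This is a minimal sketch, not the monitor code from the textbook: the breakpoint struct, the function names, and the MONITOR_CALL word are hypothetical, and MONITOR_CALL is only a stand-in for a BL instruction whose offset a real monitor would have to compute.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder word standing in for "BL breakpoint_handler" (assumption). */
#define MONITOR_CALL 0xEB000000u

typedef struct {
    uint32_t index;  /* word index of the patched instruction */
    uint32_t saved;  /* original instruction word */
    bool active;
} breakpoint;

/* Save the original instruction and patch in the call to the monitor. */
void set_breakpoint(uint32_t *code, breakpoint *bp, uint32_t index)
{
    bp->index = index;
    bp->saved = code[index];
    code[index] = MONITOR_CALL;
    bp->active = true;
}

/* Put the original instruction back so execution can continue from it. */
void clear_breakpoint(uint32_t *code, breakpoint *bp)
{
    if (bp->active) {
        code[bp->index] = bp->saved;
        bp->active = false;
    }
}
```

On real hardware the code would live in writable program memory, and the handler would also adjust the return address as described above; both are outside this sketch.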
  • 52. Debugging Techniques (H/W based): When software tools are insufficient to debug the system, hardware aids can be deployed to give a clearer view of what is happening when the system is running. The microprocessor In-circuit Emulator (ICE) is a specialized hardware tool that can help debug software in a working embedded system. At its heart is a special version of the microprocessor that allows its internal registers to be read out when it is stopped. The In-circuit Emulator surrounds this specialized microprocessor with additional logic that allows the user to specify breakpoints and examine and modify the CPU state. The CPU provides as much debugging functionality as a debugger within a monitor program, but does not take up any memory. Drawback of In-circuit Emulation: The machine is specific to a particular microprocessor, even down to the pinout. If several microprocessors are used, maintaining a fleet of In-circuit Emulators to match can be very expensive.
  • 53. Debugging Techniques (H/W based) (cont'd): The Logic Analyzer is the other major piece of instrumentation in the embedded system designer's arsenal. Think of a logic analyzer as an array of inexpensive oscilloscopes; the analyzer can sample many different signals simultaneously (tens to hundreds) but can display only 0, 1, or changing values for each. All these logic analysis channels can be connected to the system to record the activity on many signals simultaneously. The logic analyzer records the values on the signals into an internal memory and then shows the results on a display once the memory is full or the run is aborted. The logic analyzer can capture thousands or even millions of samples of data on all of these channels, providing a much larger time window into the operation of the machine than is possible with a conventional oscilloscope.
  • 54. Debugging Techniques (H/W based) (cont'd): A typical Logic Analyzer can acquire data in either of two modes, typically called State and Timing modes. The measurement resolution on each signal is reduced in both the voltage and time dimensions. The reduced voltage resolution is accomplished by measuring logic values (0, 1, x) rather than analog voltages. The reduction in timing resolution is accomplished by sampling the signal, rather than capturing a continuous waveform as in an analog oscilloscope. State and timing modes represent different ways of sampling the values. Timing mode uses an internal clock that is fast enough to take several samples per clock period in a typical system. State mode uses the system's own clock to control sampling, so it samples each signal only once per clock cycle. As a result, timing mode requires more memory to store a given number of system clock cycles. On the other hand, it provides greater resolution in the signal for detecting glitches. Timing mode is typically used for glitch-oriented debugging, while state mode is used for sequentially oriented problems.
  • 55. Debugging Techniques (H/W based) (cont'd): The internal architecture of a logic analyzer is shown in the figure below. The system's data signals are sampled at a latch within the logic analyzer; the latch is controlled by either the system clock or the internal logic analyzer sampling clock, depending on whether the analyzer is being used in state or timing mode. Each sample is copied into a vector memory under the control of a state machine. The latch, timing circuitry, sample memory, and controller must be designed to run at high speed since several samples per system clock cycle may be required in timing mode. After the sampling is complete, an embedded microprocessor takes over to control the display of the data captured in the sample memory. Logic analyzers typically provide a number of formats for viewing data. One format is a timing diagram format. Many logic analyzers allow not only customized displays, such as giving names to signals, but also more advanced display options. For example, an inverse assembler can be used to turn vector values into microprocessor instructions. The logic analyzer does not provide access to the internal state of the components, but it does give a very good view of the externally visible signals. That information can be used for both functional and timing debugging. Figure: Architecture of a Logic Analyzer.
  • 56. Debugging Challenges: Logical errors in software can be hard to track down, but errors in real-time code can create problems that are even harder to diagnose. Real-time programs are required to finish their work within a certain amount of time; if they run too long, they can create very unexpected behavior. The example below demonstrates one of the problems that can arise. A timing error in real-time code: To make it easier to compare input to output and see the results of the bug, assume that the computation produces an output equal to the input, but that a bug causes the computation to run 50% longer than its given time interval. A sample input to the program over several sample periods follows:
  • 57. Debugging Challenges (cont'd): If the program ran fast enough to meet its deadline, the output would simply be a time-shifted copy of the input. But when the program runs over its allotted time, the output becomes very different. Predicting exactly what happens requires some assumptions about the behavior of the A/D and D/A converters. First, the A/D converter holds its current sample in a register until the next sample period, and the D/A converter changes its output whenever it receives a new sample. Next, a reasonable assumption about interrupt systems is that, when an interrupt is not satisfied and the device interrupts again, the device's old value will disappear and be replaced by the new value. The basic situation that develops when the interrupt routine runs too long is something like this: 1. The A/D converter is prompted by the timer to generate a new value, saves it in the register, and requests an interrupt. 2. The interrupt handler is still running, having taken too long on the previous sample. 3. The A/D converter gets another sample at the next period. 4. The interrupt handler finishes its first request and then immediately responds to the second interrupt. It never sees the first sample and only gets the second one.
  • 58. Debugging Challenges (cont'd): Thus, assuming that the interrupt handler takes 1.5 times longer than it should, here is how it would process the sample input. Figure: input samples and the resulting output samples. The output waveform is seriously distorted because the interrupt routine grabs the wrong samples and puts the results out at the wrong times. The exact results of missing real-time deadlines depend on the detailed characteristics of the I/O devices and the nature of the timing violation. This makes debugging real-time problems especially difficult; if a system exhibits truly unusual behavior, missed deadlines should be suspected. In-circuit emulators, logic analyzers, and even LEDs can be useful tools in checking the execution time of real-time code to determine whether it in fact meets its deadline.
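The overrun scenario can be sketched as a small simulation. It is a simplified model under stated assumptions, not the textbook's figure: time is counted in half sample periods, sample k is latched at tick 2k and overwrites the previous one, the handler takes 3 ticks (1.5 periods), an interrupt is always pending when the handler finishes, and a sample latched on the same tick is already visible to the handler. The function and macro names are hypothetical.

```c
#include <stddef.h>

#define PERIOD_TICKS  2  /* one sample period, in half-period ticks */
#define HANDLER_TICKS 3  /* handler run time: 1.5 sample periods */

/* Records which input samples the overrunning handler actually reads.
   Returns the number of samples stored in seen[]. */
size_t samples_seen(const int *input, size_t n, int *seen, size_t max_seen)
{
    size_t count = 0;
    /* The handler restarts as soon as it finishes, every HANDLER_TICKS. */
    for (size_t tick = 0;
         tick / PERIOD_TICKS < n && count < max_seen;
         tick += HANDLER_TICKS) {
        /* The A/D register holds the most recently latched sample. */
        seen[count++] = input[tick / PERIOD_TICKS];
    }
    return count;
}
```

Under these assumptions, an input of 10, 20, 30, 40, 50, 60, 70 is seen by the handler as 10, 20, 40, 50, 70: every third sample is overwritten before it is read, and the samples that do get through are emitted at the handler's pace rather than the sampling rate.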
  • 59. SYSTEM-LEVEL PERFORMANCE ANALYSIS: System-level performance involves much more than the CPU. The focus is often on the CPU because it processes instructions, but any part of the system can affect total system performance. More precisely, the CPU provides an upper bound on performance, but any other part of the system can slow down the CPU. Merely counting instruction execution times is not enough. Consider the simple system in the figure below. Data needs to be moved from memory to the CPU to be processed. To get the data from memory to the CPU, the following must be done: ■ read from the memory; ■ transfer over the bus to the cache; and ■ transfer from the cache to the CPU. Figure: System-level data flows and performance.
  • 60. SYSTEM-LEVEL PERFORMANCE ANALYSIS (cont'd): The time required to transfer from the cache to the CPU is included in the instruction execution time, but the other two times are not. The most basic measure of performance is Bandwidth: the rate at which data can be moved. The point of interest is real-time performance measured in seconds, but often the simplest way to measure performance is in units of clock cycles. However, different parts of the system will run at different clock rates, so the right clock rate must be applied to each part of the performance estimate when converting clock cycles to seconds. For simplicity, consider the bandwidth provided by only one system component, the bus. Consider an image of 320 × 240 pixels, with each pixel composed of 3 bytes of data. This gives a grand total of 230,400 bytes of data. If these images are video frames, then it must be checked whether one frame can be pushed through the system within the 1/30 s available to process a frame before the next one arrives.
  • 61. SYSTEM-LEVEL PERFORMANCE ANALYSIS (cont'd): Let the bus clock period be P and the bus width be W, with W expressed in units of bytes (other measures of width could be used as well). The goal is to write formulas for the time required to transfer N bytes of data. The basic formulas are written in units of bus cycles T, and those bus cycle counts are then converted to real time t using the bus clock period P: $t = T P$ (4.1). As shown in the figure below, a basic bus transfer moves a W-wide set of bytes. The data transfer itself takes D clock cycles. (Ideally, D = 1, but a memory that introduces wait states is one example of a transfer that could require D > 1 cycles.) Figure: Times and data volumes in a basic bus transfer.
  • 62. SYSTEM-LEVEL PERFORMANCE ANALYSIS (cont'd): Addresses, handshaking, and other activities constitute overhead that may occur before ($O_1$) or after ($O_2$) the data. For simplicity, let the overhead be summed into $O = O_1 + O_2$. This gives a total transfer time in clock cycles of: $T_{basic}(N) = (D + O) \cdot N / W$ (4.2). As shown in the figure below, a burst transaction performs B transfers of W bytes each. Each of those transfers requires D clock cycles. The bus also introduces O cycles of overhead per burst. This gives $T_{burst}(N) = (B \cdot D + O) \cdot N / (B \cdot W)$ (4.3). Transferring data into and out of components also raises questions of bandwidth. The simplest illustration of this problem is memory. Figure: Times and data volumes in a burst bus transfer.
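Dividing (4.3) by (4.2) shows how much a burst saves when both kinds of transfer see the same D and O. The numeric values below (B = 4, D = 1, O = 3) are chosen only for illustration:

```latex
\frac{T_{burst}(N)}{T_{basic}(N)}
  = \frac{(B \cdot D + O) \cdot N / (B \cdot W)}{(D + O) \cdot N / W}
  = \frac{B \cdot D + O}{B \cdot (D + O)},
\qquad
\left.\frac{T_{burst}}{T_{basic}}\right|_{B=4,\,D=1,\,O=3}
  = \frac{4 + 3}{4 \cdot 4} = \frac{7}{16} \approx 0.44
```

The ratio does not depend on N or W: the saving comes entirely from paying the overhead O once per burst instead of once per W-byte transfer.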
  • 63. SYSTEM-LEVEL PERFORMANCE ANALYSIS (cont'd): A single memory chip is not solely specified by the number of bits it can hold. As shown in the figure below, memories of the same size can have different Aspect Ratios. For example, a 64-Mbit memory that is 1 bit wide will present 64 million addresses of 1-bit data. The same size memory in a 4-bit-wide format will have 16 million distinct addresses, and an 8-bit-wide memory will have 8 million distinct addresses. Memory chips do not come in extremely wide aspect ratios, but wider memories can be built by using several chips. The memory system width may also be determined by the memory modules used: rather than buying memory chips individually, memory may be bought as SIMMs or DIMMs. Which aspect ratio is preferable for the overall memory system depends also on the format of the data that needs to be stored in the memory and the speed with which it must be accessed, giving rise to bandwidth analysis.
  • 64. SYSTEM-LEVEL PERFORMANCE ANALYSIS (cont'd): Memory bandwidth suffers if the data types do not fit naturally into the width of the memory. Suppose color video pixels need to be stored in the memory. A standard pixel is three 8-bit color values (say red, green, blue). A 24-bit-wide memory would allow an entire pixel value to be read or written in one access. An 8-bit-wide memory, in contrast, would require three accesses for the pixel. With a 32-bit-wide memory there are 2 main choices: 1. One byte of each transfer could be wasted or used to store unrelated data, or 2. The pixels can be packed. In the 2nd case, the first read would get all of the first pixel and one byte of the second pixel; the second transfer would get the last two bytes of the second pixel and the first two bytes of the third pixel; and so forth. The total number of accesses A required to read E data elements of w bits each out of a memory of width W is: $A = [(E/w) \bmod W] + 1$ (4.4).
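The two layout choices for a memory wider than the pixel can be compared with a direct count of memory words. This is a sketch under stated assumptions: both widths are given in bits, the packed count is simply the total number of bits divided by the memory width and rounded up, and the function names are hypothetical.

```c
/* Accesses when elements are packed back to back:
   ceil(elements * element_bits / memory_bits). */
unsigned long packed_accesses(unsigned long elements,
                              unsigned element_bits,
                              unsigned memory_bits)
{
    unsigned long total_bits = elements * element_bits;
    return (total_bits + memory_bits - 1) / memory_bits;
}

/* Accesses when every element starts on a fresh memory word,
   wasting any leftover bits in the last word of each element. */
unsigned long unpacked_accesses(unsigned long elements,
                                unsigned element_bits,
                                unsigned memory_bits)
{
    unsigned long words_per_element =
        (element_bits + memory_bits - 1) / memory_bits;
    return elements * words_per_element;
}
```

For the cases in the text, one 24-bit pixel needs 1 access in a 24-bit-wide memory and 3 in an 8-bit-wide one; four pixels in a 32-bit-wide memory need 4 accesses unpacked but only 3 when packed.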
  • 65. Performance bottlenecks in a bus-based system: Consider a simple bus-based system in which data has to be transferred between the CPU and the memory over the bus. We need to be able to read a 320 × 240 video frame into the CPU at the rate of 30 frames/s, for a total of 612,000 bytes/s. Which will be the bottleneck and limit system performance: the bus or the memory? Assume that the bus has a 1-MHz clock rate (period of $10^{-6}$ s) and is 2 bytes wide, with D = 1 and O = 3. This gives a total transfer time of $T_{basic} = (1 + 3) \cdot 612{,}000 / 2 = 1{,}224{,}000$ cycles (4.5).
  • 66. Performance bottlenecks in a bus-based system (cont'd): $t = T_{basic} \cdot P = 1{,}224{,}000 \times 1 \times 10^{-6} = 1.224$ s (4.6). Since the total time to transfer one second's worth of frames is more than 1 s, the bus is not fast enough for our application. The memory provides a burst mode with B = 4 but is only 4 bits wide, giving W = 0.5. For this memory, D = 1 and O = 4. The clock period for this memory is $10^{-7}$ s. Then $T_{mem} = (4 \cdot 1 + 4) \cdot 612{,}000 / (4 \times 0.5) = 2{,}448{,}000$ cycles (4.7) and $t = T_{mem} \cdot P = 2{,}448{,}000 \times 1 \times 10^{-7} = 0.2448$ s (4.8). The memory requires less than 1 s to transfer the 30 frames that must be transmitted in 1 s, so it is fast enough.
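The calculation in (4.5) through (4.8) can be reproduced with a few helper functions. This is a minimal sketch: the function names are hypothetical, the formulas are (4.1) to (4.3), and the byte rate of 612,000 bytes/s is taken as given from the example.

```c
/* Cycles for n bytes using basic transfers, per (4.2). */
double t_basic(double n, double w, double d, double o)
{
    return (d + o) * n / w;
}

/* Cycles for n bytes using bursts of b transfers, per (4.3). */
double t_burst(double n, double w, double b, double d, double o)
{
    return (b * d + o) * n / (b * w);
}

/* Convert a cycle count to seconds, per (4.1). */
double cycles_to_seconds(double cycles, double period)
{
    return cycles * period;
}
```

Calling t_basic(612000, 2, 1, 3) gives the 1,224,000 bus cycles of (4.5), and t_burst(612000, 0.5, 4, 1, 4) gives the 2,448,000 memory cycles of (4.7). Converting each with its own clock period and comparing against the 1 s budget identifies the bus as the bottleneck.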
  • 67. Parallelism: When different components of the system operate in parallel, more work can be done in a given amount of time. Direct Memory Access (DMA) is a prime example of parallelism. DMA was designed to off-load memory transfers from the CPU, so the CPU can do other useful work while the DMA transfer is running. The figure below shows the paths of data transfers without and with DMA when transferring from memory to a device. Without DMA, the data must go through the CPU, and the CPU cannot do useful work during that time. Figure: DMA transfers and parallelism.
  • 68. Parallelism (cont'd): The CPU is tied up for the amount of time required for the bus transfer. Since buses often operate at slower clock rates than the CPU, that time can be considerable. System performance can be increased significantly by overlapping operations on the different units of the system. The adjacent figure shows timing diagrams for two versions of a computation. The top timing diagram shows activity in the system when the CPU first performs some setup operations, then waits for the bus transfer to complete, then resumes its work. In the bottom timing diagram, the program on the CPU has been rewritten so that its main work is broken into two sections. In this case, once the first transfer is done, the CPU can start working on that data. Meanwhile, thanks to DMA, the second transfer happens on the bus at the same time. Once that data arrives and the first calculation is finished, the CPU can go on to the second part of the computation. The result is that the entire computation finishes considerably earlier than in the sequential case. Figure: Sequential and parallel schedules in a bus-based system.
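The benefit of overlapping can be estimated with a simple model of the two schedules. The sketch below assumes one setup phase, two transfers, and two computations, each with a known duration in arbitrary time units; the function names and the durations used in the example are hypothetical.

```c
/* Sequential schedule: the CPU waits for both transfers before computing. */
unsigned sequential_time(unsigned setup, unsigned xfer1, unsigned xfer2,
                         unsigned comp1, unsigned comp2)
{
    return setup + xfer1 + xfer2 + comp1 + comp2;
}

/* Overlapped schedule: the first computation runs while DMA performs the
   second transfer; the second computation starts when both are done. */
unsigned overlapped_time(unsigned setup, unsigned xfer1, unsigned xfer2,
                         unsigned comp1, unsigned comp2)
{
    unsigned middle = (comp1 > xfer2) ? comp1 : xfer2;
    return setup + xfer1 + middle + comp2;
}
```

With setup = 1, transfers of 4 units each, and computations of 3 units each, the sequential schedule takes 15 units and the overlapped one 12. The saving equals the shorter of the first computation and the second transfer.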
  • 69. Design Example: ALARM CLOCK. Requirements: The adjacent figure illustrates the front panel design for the alarm clock. The time is shown as four digits in 12-h format; a light is used to distinguish between AM and PM. Several buttons are used to set the clock time and alarm time. When the hour and minute buttons are pressed, the hour and minute are advanced, respectively, by one. When setting the time, the set time button must be held down while the hour and minute buttons are hit; the set alarm button works in a similar fashion. The alarm on and alarm off buttons turn the alarm on and off. When the alarm is activated, the alarm ready light is on. A separate speaker provides the audible alarm. Figure: Front panel of the alarm clock.
  • 70. Design Example: ALARM CLOCK (cont'd). The Requirements Table:
  • 71. Design Example: ALARM CLOCK (cont'd): Figure 1 shows the basic classes for the alarm clock. The class that handles the basic clock operation is called the Mechanism class (based on a term from mechanical watches). Three classes represent physical elements: Lights* for all the digits and lights, Buttons* for all the buttons, and Speaker* for the sound output. The Buttons* class can easily be used directly by Mechanism. The physical display must be scanned to generate the digits output, so the Display class is introduced to abstract the physical lights. The details of the low-level user interface classes are shown in Figure 2. The Buzzer* class allows the buzzer to be turned off; analog electronics will be used to generate the buzz tone for the speaker. The Buttons* class provides read-only access to the current state of the buttons. The Lights* class allows the lights to be driven. To save pins on the display, Lights* provides signals for only one digit, along with a set of signals to indicate which digit is currently being addressed. Figure 1: Class diagram for the alarm clock. Figure 2: Details of low-level classes for the alarm clock.
  • 72. Specification: The display is generated by scanning the digits periodically. This function is performed by the Display class, which makes the display appear as an unscanned, continuous display to the rest of the system. The Mechanism class is described in the figure below. This class keeps track of the current time, the current alarm time, whether the alarm has been turned on, and whether it is currently buzzing. The clock shows the time only to the minute, but it keeps internal time to the second. The time is kept as discrete digits rather than a single integer to simplify transferring the time to the display. The class provides two behaviors, both of which run continuously. I. Scan-keyboard is responsible for looking at the inputs and updating the alarm and other functions as requested by the user. II. Update-time keeps the current time accurate. Figure: The Mechanism class.
  • 73. Specification (cont'd): The adjacent figure shows the state diagram for update-time. This behavior is straightforward, but it must do several things. It is activated once per second and must update the seconds clock. If it has counted 60 s, it must then update the displayed time; when it does so, it must roll over between digits and keep track of AM-to-PM and PM-to-AM transitions. It sends the updated time to the display object. It also compares the time with the alarm setting and sets the alarm buzzing under proper conditions. Figure: State diagram for update-time.
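The rollover logic of update-time can be sketched in C. This is an illustration under stated assumptions: the clock_state struct and the function name are hypothetical, and the time is held as plain integers for brevity, whereas the specification keeps it as discrete digits. Sending the time to the display and checking the alarm are left out.

```c
#include <stdbool.h>

typedef struct {
    int hour;    /* 1..12 */
    int minute;  /* 0..59 */
    int second;  /* 0..59 */
    bool pm;     /* false = AM, true = PM */
} clock_state;

/* Advance the clock by one second.
   Returns true when the displayed hour:minute value has changed. */
bool update_time(clock_state *c)
{
    if (++c->second < 60) {
        return false;          /* display unchanged */
    }
    c->second = 0;
    if (++c->minute < 60) {
        return true;
    }
    c->minute = 0;
    c->hour++;
    if (c->hour == 12) {
        c->pm = !c->pm;        /* 11:59 -> 12:00 crosses AM/PM */
    } else if (c->hour == 13) {
        c->hour = 1;           /* 12:59 -> 1:00 */
    }
    return true;
}
```

The return value tells the caller when the display needs the new value, which matches the requirement that the clock shows minutes but keeps time to the second.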
  • 74. Specification (cont'd): The state diagram for scan-keyboard is shown in the adjacent figure. This function is called periodically, frequently enough that all the user's button presses are caught by the system. Because the keyboard will be scanned several times per second, the same button press must not be registered several times. For example, if the minutes count were advanced on every keyboard scan while the set-time and minutes buttons were pressed, the time would be advanced much too fast. To make the buttons respond more reasonably, the function computes button activations: it compares the current state of the button to the button's value on the last scan, and it considers the button activated only when it is on for this scan but was off for the last scan. After computing the activation values for all the buttons, it looks at the activation combinations and takes the appropriate actions. Before exiting, it saves the current button values for computing activations the next time this behavior is executed. Figure: State diagram for scan-keyboard.
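The activation computation can be sketched with one bit per button. This is a minimal sketch: the function name and the bit-mask representation of the buttons are assumptions made for illustration.

```c
#include <stdint.h>

/* Returns a mask of buttons that are on in this scan but were off in the
   previous scan, then saves the current values for the next call. */
uint8_t button_activations(uint8_t current, uint8_t *previous)
{
    uint8_t activated = (uint8_t)(current & (uint8_t)~(*previous));
    *previous = current;
    return activated;
}
```

A button held down across several scans produces a single activation on the first scan only, so holding the minutes button advances the time once per press rather than once per scan.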
  • 75. System Architecture: The system has both Periodic and Aperiodic components: the current time must obviously be updated periodically, while the button commands occur occasionally. The architecture can contain the following 2 major software components: ■ An Interrupt-driven Routine can update the current time. The current time will be kept in a variable in memory. A timer can be used to interrupt periodically and update the time. The display must be sent the new value when the minute value changes. This routine can also maintain the PM indicator. ■ A Foreground Program can poll the buttons and execute their commands. Since buttons change at a relatively slow rate, it makes no sense to add the hardware required to connect the buttons to interrupts. Instead, the foreground program reads the button values and then uses simple conditional tests to implement the commands, including setting the current time, setting the alarm, and turning off the alarm. Another routine called by the foreground program will turn the buzzer on and off based on the alarm time.
  • 76. System Architecture (cont'd): The Foreground Code will be implemented as a while loop: while (TRUE) { read_buttons(button_values); /* read inputs */ process_command(button_values); /* do commands */ check_alarm(); /* decide whether to turn on the alarm */ } The loop first reads the buttons using read_buttons(). In addition to reading the current button values from the input device, this routine must preprocess the button values so that the user interface code will respond properly. As shown in the figure below, this can be done by performing a simple edge detection on the button input: the button event value is 1 for one sample period when the button is depressed, then goes back to 0 and does not return to 1 until the button has been released and depressed again. This can be accomplished by a simple two-state machine. Figure: Preprocessing button inputs.
  • 77. System Architecture (cont'd): The process_command() function is responsible for responding to button events. The check_alarm() function checks the current time against the alarm time and decides when to turn on the buzzer. This routine is kept separate from the command processing code since the alarm must go on when the proper time is reached, independent of the button inputs. From the software architecture it can be seen that a timer needs to be connected to the CPU. Logic to connect the buttons to the CPU bus will also be needed. Finally, before starting to write code and build hardware, draw the State Transition Graph for the clock's commands. That diagram will be used to guide the implementation of the software components.
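One possible shape for the alarm check is sketched below. It is an assumption-laden illustration rather than the design from the notes: the hm_time type, the parameter list, and the policy that the buzzer keeps sounding until the alarm is turned off are all hypothetical choices.

```c
#include <stdbool.h>

typedef struct {
    int hour;    /* 1..12 */
    int minute;  /* 0..59 */
    bool pm;
} hm_time;

/* Returns the new buzzer state given the current time, the alarm time,
   whether the alarm is enabled, and whether the buzzer is already on. */
bool check_alarm(hm_time now, hm_time alarm, bool alarm_on, bool buzzing)
{
    if (!alarm_on) {
        return false;   /* alarm disabled: buzzer off */
    }
    if (now.hour == alarm.hour &&
        now.minute == alarm.minute &&
        now.pm == alarm.pm) {
        return true;    /* alarm time reached: start buzzing */
    }
    return buzzing;     /* otherwise keep the current state */
}
```

Because the function depends only on the time and the alarm settings, it can be called on every pass of the foreground loop regardless of what the buttons are doing.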
  • 78. Component Design and Testing: The 2 major software components, the Interrupt Handler and the Foreground Code, can be implemented relatively straightforwardly. Since much of the functionality of the Interrupt Handler is in the interruption process itself, that code is best tested on the microprocessor platform. The Foreground Code can be more easily tested on the PC or workstation used for code development. A testbench can be created for this code that generates button depressions to exercise the state machine. The advancement of the system clock also needs to be simulated. Rather than trying to run the Interrupt Handler directly on the host, a better testing strategy is to add testing code that updates the clock, perhaps once per four iterations of the foreground while loop. With the timer taken care of in this way, the focus can be on implementing the logic to interface to the buttons, display, and buzzer. The buttons will require debouncing logic. The display will require a register to hold the current display value in order to drive the display elements.
  • 79. System Integration and Testing: Because this system has a small number of components, system integration is relatively easy. The software must be checked to ensure that debugging code has been turned off. Three types of tests can be performed. 1. The clock's accuracy can be checked against a reference clock. 2. The commands can be exercised from the buttons. 3. The buzzer's functionality should be verified.
  • 80. THANK YOU