Chapter 3
 

Usage Rights

© All Rights Reserved


    Chapter 3 Presentation Transcript

    • The embedded computing platform
    • The CPU Bus
      The bus is the mechanism by which the CPU communicates with memory and devices.
      A bus is, at a minimum, a collection of wires, but the bus also defines a protocol by which the CPU, memory and devices communicate.
      One of the major roles of the bus is to provide an interface to memory.
    • Bus Protocols
      Bus protocol determines how devices communicate.
      Devices on the bus go through sequences of states.
      Protocols are specified by state machines, one state machine per actor in the protocol.
      May contain asynchronous logic behavior.
    • Four-cycle handshake
      Device 1 raises its output to signal an enquiry, which tells device 2 that it should get ready to listen for data.
      When device 2 is ready to receive, it raises its output to signal an acknowledgement. At this point, devices 1 and 2 can transmit or receive.
      Once the data transfer is complete, device 2 lowers its output, signaling that it has received the data.
      After seeing that ack has been released, device 1 lowers its output.
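      The four steps above can be sketched as a small state machine. This is only an illustrative model: the type and function names (handshake_t, handshake_step) are assumptions, and both devices are folded into one stepping function for brevity rather than modeled as separate actors.

```c
#include <assert.h>
#include <stdbool.h>

/* Phase of the handshake; needed because enq=1, ack=0 occurs twice. */
typedef enum { IDLE, ENQ_RAISED, ACK_RAISED, ACK_LOWERED } phase_t;

/* Hypothetical model of the two handshake wires plus the current phase. */
typedef struct {
    bool    enq;    /* driven by device 1 */
    bool    ack;    /* driven by device 2 */
    phase_t phase;
} handshake_t;

/* Perform the next step of the handshake and return its number (1-4). */
int handshake_step(handshake_t *h)
{
    assert(h != 0);
    switch (h->phase) {
    case IDLE:        h->enq = true;  h->phase = ENQ_RAISED;  return 1; /* device 1 raises enq */
    case ENQ_RAISED:  h->ack = true;  h->phase = ACK_RAISED;  return 2; /* device 2 raises ack */
    case ACK_RAISED:  h->ack = false; h->phase = ACK_LOWERED; return 3; /* device 2 lowers ack */
    case ACK_LOWERED: h->enq = false; h->phase = IDLE;        return 4; /* device 1 lowers enq */
    }
    return 0; /* not reached for valid phases */
}
```

      Calling handshake_step four times from the idle state walks through one complete transfer and returns both wires to low.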
    • Four-cycle handshake
      [Figure: timing of the enq signal from device 1 and the ack signal from device 2, with steps 1 to 4 marked.]
    • Microprocessor busses
      Clock provides synchronization to the bus components.
      R/W’ is true when the bus is reading and false when the bus is writing.
      Address is an a-bit bundle of signals that transmits the address for an access.
      Data is an n-bit bundle of signals that can carry data to or from the CPU.
      Data ready’ signals when the values on the data bundle are valid.
    • A typical microprocessor bus
    • Timing diagrams
      A timing diagram shows how the signals on a bus vary over time.
      Since signals like the address and data can take on many values, some standard notation is used to describe them.
      A signal can be shown as a 0 or 1 level, or as stable or changing.
      To be sure that signals go to their proper values at the proper time, timing diagrams sometimes show timing constraints.
    • Timing diagrams
    • Timing diagram for the example bus
      Timing diagram shown with timing constraints for the example bus.
      The diagram shows a read and a write.
      Timing constraints are shown only for the read operation, but similar constraints apply to the write operation.
      The bus is normally in read mode, since that does not change any state.
      During a read the external device or memory is sending a value on the data lines, while during a write the CPU is controlling the data lines.
    • Bus read and write
    • Read Operation on timing diagram
      A read or write is initiated by setting address enable high after the clock starts to rise. We set R/W’=1 to indicate a read and the address lines are set to the desired address.
      One clock cycle later, the memory or device is expected to assert the data value at that address on the data lines. Simultaneously, the external device specifies that the data are valid by pulling down the data ready’ line. This line is active low, meaning that a logically true value is indicated by a low voltage, in order to provide increased immunity to electrical noise.
      The CPU is free to remove the address at the end of clock cycle and must do so before the beginning of the next cycle. The external device has a similar requirement for removing the data value from the data lines.
    • Bus wait state
    • Burst read
      The handshake that tells the CPU and devices when data are to be transferred is formed by data ready’ for the acknowledge side, but is implicit for the enquiry side.
      The data ready’ signal allows the bus to be connected to devices that are slower than the bus.
      The cycles between the minimum time at which data can be asserted and the time at which they are actually asserted are known as wait states.
      In a burst read transaction, the CPU sends one address but receives a sequence of data values.
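      As a rough illustration of wait states, the sketch below counts how many extra cycles a read takes when a slow device delays data ready’. The device model (a fixed latency in clock cycles) and the function names are assumptions made for this example, not part of the bus described here.

```c
#include <assert.h>

/* Hypothetical device model: data ready' (active low) is asserted
   `latency` clock cycles after the address is presented. */
static int data_ready_n(int cycles_since_address, int latency)
{
    return cycles_since_address >= latency ? 0 : 1; /* 0 means asserted */
}

/* Count the wait states in one read: the cycles spent waiting beyond the
   one-cycle minimum before data ready' is asserted. */
int read_wait_states(int latency)
{
    assert(latency >= 1);
    int cycle = 1;   /* earliest cycle at which the data may be valid */
    int waits = 0;
    while (data_ready_n(cycle, latency) != 0) {
        waits++;     /* device not ready: insert a wait state */
        cycle++;
    }
    return waits;
}
```

      Under this model a device that answers in one cycle needs no wait states, and a device with a three-cycle latency inserts two.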
    • Bus burst read
    • State diagrams for bus read
      [Figure: two state machines. CPU: start, Get data, Done, See ack, Adrs, Wait. Device: Send data, Release ack, Ack, Adrs, Wait.]
    • State diagram
      The state machine view of the bus transaction is a useful complement to the timing diagram.
      It shows the transitions of the control signals.
      When the CPU decides to perform a read transaction, it moves to a new state, sending bus signals that cause the device to behave appropriately.
      The device’s state transition graph captures its side of the protocol.
    • Bus multiplexing
      [Figure: CPU and device connected by shared address/data lines, with data enable and adrs enable control signals.]
    • Bus multiplexing
      Some buses use multiplexed address and data.
      Additional control lines are provided to tell whether the value on the address/data lines is an address or data.
      Typically, the address comes first followed by the data.
      The address can be held in a register until the data arrive so that both can be presented to the device at the same time.
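      The register-based scheme just described can be sketched in software. This is a simplified model under stated assumptions: the 16-bit line width, the demux_t type, and the demux_cycle function are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical demultiplexer state on the device side. */
typedef struct {
    uint16_t adrs_reg;    /* holds the address until the data arrive */
    bool     adrs_valid;  /* true once an address has been captured */
} demux_t;

/* Process one bus cycle on the shared address/data lines.
   Returns true when a complete (address, data) pair is available. */
bool demux_cycle(demux_t *d, uint16_t lines, bool adrs_enable, bool data_enable,
                 uint16_t *adrs_out, uint16_t *data_out)
{
    assert(!(adrs_enable && data_enable)); /* lines carry one or the other */
    if (adrs_enable) {
        d->adrs_reg = lines;               /* address phase: latch it */
        d->adrs_valid = true;
        return false;
    }
    if (data_enable && d->adrs_valid) {
        *adrs_out = d->adrs_reg;           /* data phase: present both together */
        *data_out = lines;
        d->adrs_valid = false;
        return true;
    }
    return false;                          /* idle cycle, or data with no address */
}
```

      The address cycle produces nothing by itself; only the following data cycle yields a pair that can be handed to the device.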
    • DMA
      Direct memory access (DMA) performs data transfers without executing instructions.
      CPU sets up transfer.
      DMA engine fetches, writes.
      DMA controller is a separate unit.
    • Bus mastership
      By default, CPU is bus master and initiates transfers.
      DMA must become bus master to perform its work.
      CPU can’t use bus while DMA operates.
      Bus mastership protocol:
      Bus request.
      Bus grant.
    • DMA operation
      CPU sets DMA registers for start address, length.
      DMA status register controls the unit.
      Once DMA is bus master, it transfers automatically.
      May run continuously until complete.
      May use every nth bus cycle.
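      A minimal software model of this sequence is sketched below. The register layout (dma_regs_t) and the function names are assumptions for illustration; real DMA controllers define their own registers, and bus request/grant is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register set for a memory-to-memory DMA controller. */
typedef struct {
    size_t src;     /* start address of the source block */
    size_t dst;     /* start address of the destination block */
    size_t length;  /* number of bytes still to transfer */
    int    busy;    /* status: nonzero while a transfer is in progress */
} dma_regs_t;

/* CPU side: program the start addresses and length, then start the unit. */
void dma_setup(dma_regs_t *dma, size_t src, size_t dst, size_t length)
{
    dma->src = src;
    dma->dst = dst;
    dma->length = length;
    dma->busy = (length > 0);
}

/* DMA side: move one byte per granted bus cycle; returns bytes remaining. */
size_t dma_bus_cycle(dma_regs_t *dma, uint8_t *memory, size_t mem_size)
{
    if (!dma->busy)
        return 0;
    assert(dma->src < mem_size && dma->dst < mem_size);
    memory[dma->dst++] = memory[dma->src++];
    if (--dma->length == 0)
        dma->busy = 0;   /* transfer complete */
    return dma->length;
}
```

      Calling dma_bus_cycle on every cycle models a transfer that runs continuously; calling it on every nth cycle models a controller that shares the bus with the CPU.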
    • Bus transfer sequence diagram
    • System bus configurations
      Multiple busses allow parallelism:
      Slow devices on one bus.
      Fast devices on separate bus.
      A bridge connects two busses.
      [Figure: CPU, memory, and a high-speed device on one bus; a bridge connects it to a second bus with slow devices.]
    • Bridge state diagram
    • ARM AMBA bus
      Two varieties:
      AHB is high-performance.
      APB is lower-speed, lower cost.
      AHB supports pipelining, burst transfers, split transactions, multiple bus masters.
      All devices are slaves on APB.
    • Memory Devices
      Several different types of memory:
      Read Only Memories
      Flash.
      Read/Write Memories
      DRAM.
      SRAM.
      Each type of memory comes in varying:
      Capacities.
      Widths.
    • Memory Device Organization
      4-Mbit memory may be
      1M x 4-bit array – a single memory access obtains a 4-bit data item, with a maximum of 2^20 different addresses.
      4M x 1-bit array – a single memory access obtains a 1-bit data item, with a maximum of 2^22 different addresses.
      • The height/width ratio of a memory is known as its aspect ratio.
      • The data are stored in 2-D array of memory cells.
      • n-bit (n = r + c) address
      • A row address
      • A column address
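      The row/column split of an n-bit address can be sketched as follows. The assumption that the row occupies the upper r bits and the column the lower c bits is made only for this example; an actual device may arrange its array differently.

```c
#include <assert.h>
#include <stdint.h>

/* Row and column parts of a memory cell address. */
typedef struct {
    uint32_t row;
    uint32_t col;
} cell_addr_t;

/* Split an n-bit address (n = r + c) into an r-bit row and a c-bit column.
   Assumption: row bits are the upper bits, column bits the lower bits. */
cell_addr_t split_address(uint32_t addr, unsigned r, unsigned c)
{
    assert(r > 0 && c > 0 && r + c <= 32);
    cell_addr_t out;
    out.col = addr & ((1u << c) - 1u);         /* low c bits */
    out.row = (addr >> c) & ((1u << r) - 1u);  /* next r bits */
    return out;
}
```

      For a 1M-word device with a 20-bit address and r = c = 10, the address 0xABCDE splits into row 0x2AF and column 0x0DE.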
    • Internal Organization of a Memory Device
    • Random-access memory
      Dynamic RAM is dense, requires refresh.
      Synchronous DRAM is dominant type.
      SDRAM uses clock to improve performance, pipeline memory accesses.
      Static RAM is faster, less dense, consumes more power.
    • Static RAM and its operation
      CE is the chip enable input. It is active low: when CE=1 the SRAM’s data pins are disabled, and when CE=0 the data pins are enabled.
      R/W controls whether the current operation is a read (R/W=1) or a write (R/W=0). Read and write are normally specified relative to the CPU, so read means reading from RAM and write means writing to RAM.
      Adrs specifies the address for the read or write.
      Data is a bidirectional bundle of signals for data transfer. When R/W=1, the data pins are outputs, and when R/W=0, the data pins are inputs.
    • Timing diagram
      Interface
    • A read operation on the SRAM occurs as follows:
      CE is set to zero enabling the chip with R/W=1.
      An address is presented on the address lines.
      After some delay, data appear on the data lines.
      A write operation is similar:
      CE is set to zero.
      R/W is set to 0 for writing.
      An address is set on the address line and data is set on the data lines.
    • Timing diagram
      Interface
    • Timing diagram for Read
      First, RAS is set to 0 and the row part of the address is set on the address lines.
      Next, CAS is set to 0 and the column part of the address is put on the address lines.
    • Read-only memory
      ROM may be programmed at factory.
      Flash is dominant form of field-programmable ROM.
      Electrically erasable, must be block erased.
      Random access, but write/erase is much slower than read.
      NOR flash is more flexible.
      NAND flash is more dense.
    • Flash memory
      Non-volatile memory.
      Flash can be programmed in-circuit.
      Random access for read.
      To write:
      Erase a block to 1.
      Write bits to 0.
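      The erase-then-write rule can be modeled in a few lines. This is a sketch, not a driver: the tiny BLOCK_SIZE and the function names are assumptions, and programming is modeled as a bitwise AND because a write can only change bits from 1 to 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16u  /* hypothetical, tiny block used only for illustration */

/* Erase sets every bit in the block to 1. */
void flash_erase_block(uint8_t *block)
{
    memset(block, 0xFF, BLOCK_SIZE);
}

/* Programming can only clear bits (1 -> 0), so it behaves like an AND. */
void flash_program(uint8_t *block, size_t offset, uint8_t value)
{
    assert(offset < BLOCK_SIZE);
    block[offset] = (uint8_t)(block[offset] & value);
}
```

      Writing 0x5A to an erased byte stores 0x5A, but writing 0xA5 over it afterwards leaves 0x00, because no bit can return to 1 until the whole block is erased again.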
    • Flash writing
      Write is much slower than read.
      1.6 ms write, 70 ns read.
      Blocks are large (approx. 1 Mb).
      Writing causes wear that eventually destroys the device.
      Modern lifetime approx. 1 million writes.
    • Types of flash
      NOR:
      Word-accessible read.
      Erase by blocks.
      NAND:
      Read by pages (512-4K bytes).
      Erase by blocks.
      NAND is cheaper, has faster erase, sequential access times.
    • I/O Devices
      • I/O devices are commonly used in embedded computing systems.
      • Some devices are often found as on-chip devices in microcontrollers.
      • Other devices are interfaced externally.
      • We need to understand the requirements of device interfacing and how devices are used in programs.
    • Timers and counters
      Very similar:
      a timer is incremented by a periodic signal;
      a counter is incremented by an asynchronous, occasional signal.
      Rollover causes interrupt.
    • Watchdog timer
      Watchdog timer is periodically reset by system timer.
      If watchdog is not reset, it generates an interrupt to reset the host.
      [Figure: the timer sends reset to the watchdog; the watchdog sends an interrupt to the host CPU.]
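      The watchdog behavior can be sketched as a counter that must be cleared before it reaches a limit. The structure and function names below are assumptions for illustration, and the timeout is counted in abstract ticks rather than real time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical watchdog model. */
typedef struct {
    unsigned count;    /* ticks since the watchdog was last reset */
    unsigned timeout;  /* ticks allowed before the watchdog fires */
} watchdog_t;

/* Called periodically by a healthy system to reset the watchdog. */
void watchdog_kick(watchdog_t *w)
{
    w->count = 0;
}

/* Advance the watchdog by one tick.
   Returns true when the timeout has expired and the host should be reset. */
bool watchdog_tick(watchdog_t *w)
{
    assert(w->timeout > 0);
    if (w->count < w->timeout)
        w->count++;
    return w->count >= w->timeout;
}
```

      As long as watchdog_kick is called more often than the timeout, watchdog_tick keeps returning false; if the software hangs and stops kicking, the next ticks eventually return true.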
    • Digital-to-analog conversion
      Use resistor tree:
      [Figure: bits bn, bn-1, bn-2, and bn-3 drive resistors R, 2R, 4R, and 8R respectively, summing into Vout.]
    • Flash A/D conversion
      An n-bit result requires 2^n comparators:
      [Figure: Vin feeds a bank of comparators whose outputs drive an encoder.]
    • Dual-slope conversion
      Use a counter to measure the time required to charge and discharge a capacitor.
      Charging, then discharging eliminates non-linearities.
      [Figure: Vin drives the conversion circuit, which is timed by a timer.]
    • Sample-and-hold
      Samples data:
      [Figure: Vin passes through the sample-and-hold circuit to the converter.]
    • Switch debouncing
      A switch must be debounced to eliminate the multiple contacts caused by mechanical bouncing:
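      Debouncing is often done in hardware, but a software version is a common alternative. The sketch below is one such approach under stated assumptions: the switch is sampled periodically, and STABLE_SAMPLES (a hypothetical value) consecutive samples must agree before a change is accepted.

```c
#include <assert.h>
#include <stdbool.h>

#define STABLE_SAMPLES 4u  /* hypothetical: samples required to accept a change */

/* Debouncer state for one switch. */
typedef struct {
    bool     state;  /* debounced output */
    unsigned run;    /* consecutive samples that disagree with state */
} debounce_t;

/* Feed one raw sample of the switch; returns the debounced state. */
bool debounce_sample(debounce_t *d, bool raw)
{
    assert(d != 0);
    if (raw == d->state) {
        d->run = 0;                      /* no change, or a bounce ended */
    } else if (++d->run >= STABLE_SAMPLES) {
        d->state = raw;                  /* change has been stable long enough */
        d->run = 0;
    }
    return d->state;
}
```

      A brief bounce resets the run counter, so the output changes only after the contact has settled.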
    • Encoded keyboard
      An array of switches is read by an encoder.
      N-key rollover remembers multiple key depressions.
    • LED
      Must use resistor to limit current:
    • 7-segment LCD display
      May use parallel or multiplexed input.
    • Types of high-resolution display
      Liquid crystal display (LCD) is dominant form.
      Plasma, OLED, etc.
      Frame buffer holds current display contents.
      Written by processor.
      Read by video.
    • Touchscreen
      Includes input and output device.
      Input device is a two-dimensional voltmeter:
    • Touchscreen position sensing
      [Figure: the touch position sets a voltage that is read by an ADC.]
    • Component Interfacing : Memory interfacing
      Static RAM is simpler to interface to a bus than is DRAM, due to both the DRAM’s RAS/CAS multiplexing and the need for refresh.
      The R/W on the bus can often be directly connected to the SRAM.
      The main issue in interfacing SRAM is decoding the address.
      The chip enable pin is used in RAMs to simplify the interfacing of large memories.
      If the required number of memory words fits within the height of an available memory, then the interface is simple: the CE signal is permanently wired to ground so that the chip is always enabled.
    • DRAM interfacing
      The bus address can be split into row and column addresses with a small amount of logic: a register captures the address, a multiplexer selects the row or column portion of the address, and a state machine generates RAS and CAS.
      The refresh signal can be generated with a counter and a state machine as shown.
      The counter times the wait between successive refresh actions, and the controller generates the required signals.
      In the idle state, the bus signals are passed through to the DRAM to enable reads and writes.
      When the counter rolls over, the controller generates CAS and then RAS to induce the next refresh cycle.
    • Device interfacing
      Some I/O devices are designed to interface directly to a particular bus, forming glueless interfaces.
      But glue logic is required when a device is connected to a bus for which it is not designed.
      An I/O device typically requires a much smaller range of addresses than a memory, so addresses must be decoded much more finely.
      Some additional logic is required to cause the bus to read and write the device’s registers.
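      Fine-grained address decoding can be illustrated with a mask-and-compare. Everything specific here is an assumption for the example: a hypothetical device with four registers at 0x4000 to 0x4003 on a 16-bit address bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device: 4 registers at 0x4000-0x4003 on a 16-bit address bus. */
#define DEV_BASE 0x4000u
#define DEV_MASK 0xFFFCu  /* upper 14 bits select the device */
#define REG_MASK 0x0003u  /* lower 2 bits select one of its registers */

/* True when the address falls inside the device's small address range. */
bool device_selected(uint16_t addr)
{
    return (addr & DEV_MASK) == DEV_BASE;
}

/* Index of the addressed register; valid only for a selected address. */
unsigned device_register(uint16_t addr)
{
    assert(device_selected(addr));
    return addr & REG_MASK;
}
```

      Because only four addresses belong to the device, the decoder must compare 14 of the 16 address bits, which is what "decoded much more finely" means in practice.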
    • System architecture
      • An architecture is a set of elements and the relationships between them that together form a single unit.
      • The architecture of an embedded computing system is the blue-print for implementing that system.
      • The architecture of an embedded computing system includes both hardware and software elements.
      Hardware:
      The hardware architecture of an embedded system is the more obvious manifestation of the architecture, since you can touch it and feel it.
      CPU:
      • There are many different architectures, and even within an architecture we can select between models that vary in clock speed, integrated peripherals, and so on.
      • The choice of the CPU cannot be made without considering the software that will execute on the machine.
    • System architecture
      Bus:
      • In applications that make intensive use of the bus due to I/O or other data traffic, the bus may be more of a limiting factor than the CPU.
      • Attention must be paid to the required data bandwidths to be sure that the bus can handle the traffic.
      Memory:
      • The ratio of ROM to RAM and selection of DRAM versus SRAM can have a significant influence on the cost of the system.
      • The speed of memory will play a large part in determining system performance.
      I/O devices:
      • networking, sensors, actuators, etc.
      How big/fast must each one be?
    • Software architecture
      Functional description must be broken into pieces:
      division among people;
      conceptual organization;
      performance;
      testability;
      maintenance.
    • Hardware and software architectures
      Hardware and software are intimately related:
      software doesn’t run without hardware;
      how much hardware you need is determined by the software requirements:
      speed;
      memory.
    • Evaluation boards
      Designed by CPU manufacturer or others.
      Includes CPU, memory, some I/O devices.
      May include prototyping section.
      CPU manufacturer often gives out evaluation board netlist---can be used as starting point for your custom board design.
    • Adding logic to a board
      Programmable logic devices (PLDs) provide low/medium density logic.
      Field-programmable gate arrays (FPGAs) provide more logic and multi-level logic.
      Application-specific integrated circuits (ASICs) are manufactured for a single purpose.
    • The PC as a platform
      Advantages:
      cheap and easy to get;
      rich and familiar software environment.
      Disadvantages:
      requires a lot of hardware resources;
      not well-adapted to real-time.
    • Typical PC hardware platform
      [Figure: CPU, memory, DMA controller, timers, and interrupt controller on the CPU bus; one bus interface connects to a high-speed bus with a device, and another connects to a low-speed bus with a device.]
    • Typical PC hardware platform
      The CPU provides basic computational facilities.
      RAM is used for program storage.
      ROM holds the boot program.
      A DMA controller provides DMA capabilities.
      Timers are used by the operating system for a variety of purposes.
      A high-speed bus, connected to the CPU bus through a bridge, allows fast devices to communicate efficiently with the rest of the system.
      A low-speed bus provides an inexpensive way to connect simpler devices and may be necessary for backward compatibility as well.
    • Typical busses
      PCI: standard for high-speed interfacing
      33 or 66 MHz.
      PCI Express.
      USB (Universal Serial Bus), Firewire (IEEE 1394): relatively low-cost serial interface with high speed.
    • Software elements
      IBM PC uses BIOS (Basic I/O System) to implement low-level functions:
      boot-up;
      minimal device drivers.
      BIOS has become a generic term for the lowest-level system software.
    • Example: StrongARM
      StrongARM system includes:
      CPU chip (3.686 MHz clock)
      system control module (32.768 kHz clock).
      Real-time clock;
      operating system timer
      general-purpose I/O;
      interrupt controller;
      power manager controller;
      reset controller.
    • Strong ARM SA-1100
    • Peripheral devices of system control module:
      A real time clock.
      An operating system timer.
      28 general-purpose I/Os(GPIOs).
      An interrupt controller.
      A power manager controller.
      A reset controller that handles resetting the processor.
    • Debugging embedded systems
      Challenges:
      target system may be hard to observe;
      target may be hard to control;
      may be hard to generate realistic inputs;
      setup sequence may be complex.
    • Host/target design
      Use a host system to prepare software for target system:
      [Figure: the host system is connected to the target system by a serial line.]
    • Host/target design
      • Load the programs into the target,
      • Start and stop program execution on the target, and
      • Examine memory and CPU registers.
    • Host-based tools
      Cross compiler:
      compiles code on host for target system.
      Cross debugger:
      displays target state, allows target system to be controlled.
    • Software debuggers
      A monitor program residing on the target provides basic debugger functions.
      Debugger should have a minimal footprint in memory.
      The user program must be careful not to destroy the debugger program, but the debugger should be able to recover from some damage caused by user code.
    • Breakpoints
      A breakpoint allows the user to stop execution, examine system state, and change state.
      Replace the breakpointed instruction with a subroutine call to the monitor program.
    • ARM breakpoints
      0x400 MUL r4,r6,r6
      0x404 ADD r2,r2,r4
      0x408 ADD r0,r0,#1
      0x40c B loop
      uninstrumented code
      0x400 MUL r4,r6,r6
      0x404 ADD r2,r2,r4
      0x408 ADD r0,r0,#1
      0x40c BL bkpoint
      code with breakpoint
    • Breakpoint handler actions
      Save registers.
      Allow user to examine machine.
      Before returning, restore system state.
      Safest way to execute the instruction is to replace it and execute in place.
      Put another breakpoint after the replaced breakpoint to allow restoring the original breakpoint.
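      The replace-and-restore bookkeeping can be sketched as follows. This model treats code memory as an array of words; BKPT_WORD is a placeholder value standing in for the "BL bkpoint" instruction, not an instruction assembled for any particular handler address, and the type and function names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder word standing in for "BL bkpoint"; a real monitor would
   assemble a branch-and-link to its own handler address. */
#define BKPT_WORD 0xEB000000u

/* Record of one planted breakpoint. */
typedef struct {
    uint32_t *addr;   /* location of the replaced instruction */
    uint32_t  saved;  /* original instruction word */
} breakpoint_t;

/* Plant a breakpoint: save the original instruction, then overwrite it. */
void breakpoint_set(breakpoint_t *bp, uint32_t *addr)
{
    bp->addr = addr;
    bp->saved = *addr;
    *addr = BKPT_WORD;
}

/* Remove the breakpoint: put the original instruction back in place. */
void breakpoint_clear(breakpoint_t *bp)
{
    assert(bp->addr != 0);
    *bp->addr = bp->saved;
    bp->addr = 0;
}
```

      To resume past a breakpoint, a monitor could call breakpoint_clear, plant a temporary breakpoint on the following instruction, run, and then call breakpoint_set again on the original address.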
    • In-circuit emulators
      A microprocessor in-circuit emulator is a specially-instrumented microprocessor.
      Allows you to stop execution, examine CPU state, modify registers.
    • Logic analyzers
      A logic analyzer is an array of low-grade oscilloscopes:
    • Logic analyzer architecture
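      [Figure: samples of system data from the UUT are stored in sample memory, addressed by a vector address; a controller selects state or timing mode using the system clock or a clock generator; a microprocessor drives the keypad and display.]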
    • Hardware/software co-verification
      An instruction level simulation may be used to debug code running on the CPU.
      A cycle-level simulation tool may be used for faster simulation of parts of the system.
      A hardware/software co-simulator may be used to simulate various parts of the system at different levels of detail.
    • Bus-Based Computer Systems
      Designing with microprocessors.
      Development and debugging.
      System-level performance analysis.
      Example: alarm clock
    • Design Example : Alarm clock
      [Figure: alarm clock front panel with alarm on and alarm off buttons, PM light, alarm ready light, set time and set alarm buttons, and hour and minute buttons.]
    • Operations
      Set time: hold set time, depress hour, minute.
      Set alarm time: hold set alarm, depress hour, minute.
      Turn alarm on/off: depress alarm on/off.
    • Alarm clock requirements
    • Alarm clock class diagram
      [Class diagram: Lights*, Display, Mechanism, Buttons*, and Speaker*, linked by one-to-one associations.]
    • Alarm clock physical classes
      Lights*: digit-val(), digit-scan(), alarm-on-light(), PM-light()
      Buttons*: set-time(): boolean, set-alarm(): boolean, alarm-on(): boolean, alarm-off(): boolean, minute(): boolean, hour(): boolean
      Speaker*: buzz()
    • Display class
      Display
      time[4]: integer
      alarm-indicator: boolean
      PM-indicator: boolean
      set-time()
      alarm-light-on()
      alarm-light-off()
      PM-light-on()
      PM-light-off()
    • Mechanism class
      Mechanism
      Seconds: integer
      PM: boolean
      tens-hours, ones-hours: boolean
      tens-minutes, ones-minutes: boolean
      alarm-ready: boolean
      alarm-tens-hours, alarm-ones-hours: boolean
      alarm-tens-minutes, alarm-ones-minutes: boolean
      scan-keyboard()
      update-time()
    • Update-time behavior
      [Flowchart: update seconds with rollover; if seconds rolled over, update hh:mm with rollover, switching AM->PM (PM=true) or PM->AM (PM=false) when needed; call display.set-time(current time); if time >= alarm and alarm-on, call alarm.buzzer(true).]
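      The rollover part of update-time can be sketched in C. This is a simplified model: it keeps the time as plain integers in a hypothetical alarm_time_t structure instead of the separate tens and ones digits of the Mechanism class, and it leaves out the display update and alarm check.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified time-of-day record for a 12-hour clock. */
typedef struct {
    int  seconds;  /* 0-59 */
    int  minutes;  /* 0-59 */
    int  hours;    /* 1-12 */
    bool pm;       /* true for PM, false for AM */
} alarm_time_t;

/* Advance the time by one second, rolling over seconds, minutes,
   hours, and the AM/PM flag as needed. */
void update_time(alarm_time_t *t)
{
    assert(t->hours >= 1 && t->hours <= 12);
    if (++t->seconds < 60)
        return;
    t->seconds = 0;
    if (++t->minutes < 60)
        return;
    t->minutes = 0;
    t->hours = (t->hours % 12) + 1;  /* 12 -> 1, otherwise add one */
    if (t->hours == 12)
        t->pm = !t->pm;              /* 11:59:59 -> 12:00:00 flips AM/PM */
}
```

      In the design described here this function would be driven by the periodic timer interrupt, once per second.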
    • Scan-keyboard behavior
      [Flowchart: compute button activations; if set-time and not set-alarm and hours, increment time tens with rollover and AM/PM; if set-time and not set-alarm and minutes, increment time ones with rollover and AM/PM; on alarm-on, set alarm-ready = true; on alarm-off, set alarm-ready = false and call alarm.buzzer(false); finally, save button states.]
    • System architecture
      The system has both periodic and aperiodic components: the current time must be updated periodically, while the button commands occur only occasionally.
      It seems reasonable to have the following two major software components:
      • An interrupt driven routine can update the current time. The current time will be kept in a variable in memory. A timer can be used to interrupt periodically and update the time.
      • A foreground program can poll the buttons and execute their commands. Since buttons are changed at a relatively slow rate, it makes no sense to add the hardware required to connect the buttons to interrupts.
    • System architecture
      The foreground code will be implemented as a while loop:
      while (TRUE) {
          read_buttons(button_values);    /* read inputs */
          process_command(button_values); /* do commands */
          check_alarm();                  /* decide whether to turn on the alarm */
      }
    • System architecture
      • The loop first reads the buttons with its first call.
      • The buttons will remain depressed for many sample periods since the sample rate is much faster than any person can push and release buttons.
      • We want to make sure that the clock responds to this as a single depression of the button, not one depression per sample interval.
    • Testing
      Component testing:
      test interrupt code on the platform;
      can test foreground program using a mock-up.
      System testing:
      relatively few components to integrate;
      check clock accuracy;
      check recognition of buttons, buzzer, etc.
    • Preprocessing button inputs
      As shown in the figure, this can be done using simple edge detection.
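      Edge detection on sampled buttons can be sketched with two bitwise operations. The packing of one button per bit and the function name button_presses are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Buttons are packed one per bit; a 1 means the button is currently down.
   Returns a mask of buttons that were up at the previous sample and are
   down now, and records the current sample for the next call. */
uint8_t button_presses(uint8_t current, uint8_t *previous)
{
    assert(previous != 0);
    uint8_t rising = (uint8_t)(current & (uint8_t)~*previous);
    *previous = current;
    return rising;
}
```

      A button held down across many samples is reported only once, on the sample where it first appears as pressed, so the command-processing code sees one activation per depression.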