CHAPTER 4
PROCESSOR TECHNOLOGY AND ARCHITECTURE

- Describe CPU instruction and execution cycles
- Explain how primitive CPU instructions are combined to form complex processing operations
- Describe key CPU design features, including instruction format, word size, and clock rate
- Describe the function of general-purpose and special-purpose registers
- Explain methods of enhancing processor performance
- Describe the principles and limitations of semiconductor-based microprocessors
- Summarize future processing trends

Chapter 2 gave a brief overview of computer processing, including the function of a processor, general-purpose and special-purpose processors, and the components of a central processing unit (CPU). This chapter explores CPU operation: instructions, components, and implementation (see Figure 4.1). It also gives you an overview of future trends in processor technology and architecture.
CPU OPERATION:
Recall from Chapter 2 that a CPU has three primary components: the control unit, the arithmetic logic unit (ALU), and a set of registers (see Figure 4.2). The control unit moves data and instructions between main memory and registers, and the ALU performs all computation and comparison operations. Registers are storage locations that hold inputs and outputs for the ALU.
A complex chain of events occurs when the CPU executes a program. To start, the control unit reads the first instruction from primary storage. It then stores the instruction in a register and, if necessary, reads data inputs from primary storage and stores them in registers. If the instruction is a computation or comparison instruction, the control unit signals the ALU what function to perform, where the input data is located, and where to store the output data. The control unit handles executing instructions that move data to memory, I/O devices, or secondary storage. When the first instruction has been executed, the next instruction is read and executed. This process continues until the program's final instruction has been executed.
FIGURE 4.1
The actions the CPU performs can be divided into two groups: the fetch cycle (or instruction cycle) and the execution cycle. During the fetch cycle, data inputs are prepared for
transformation into data outputs. During the execution cycle, the transformation takes place and data output is stored. The CPU alternates constantly between fetch and execution cycles. Figure 4.3 shows the flow between fetch and execution cycles (denoted by solid arrows) and data and instruction movement (denoted by dashed arrows).
During the fetch cycle, the control unit does the following:
- Fetches an instruction from primary storage
- Increments a pointer to the location of the next instruction
- Separates the instruction into components: the instruction code (or number) and the data inputs to the instruction
- Stores each component in a separate register
During the execution cycle, the ALU does the following:
- Retrieves the instruction code from a register
- Retrieves data inputs from registers
- Passes data inputs through internal circuits to perform the addition, subtraction, or other data transformation
- Stores the result in a register
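The alternation between fetch and execution cycles can be sketched as a short simulation. The instruction format, op codes, and register numbers below are illustrative assumptions for the sketch, not any real CPU's instruction set:

```python
# A minimal sketch of the fetch/execution cycle. Instructions and data
# share one primary storage area; each instruction is an (op, a, b) tuple.
# Op codes (LOAD, ADD, STORE, HALT) and addresses are invented for illustration.

memory = {
    0: ("LOAD", 1, 100),   # load memory[100] into register 1
    1: ("LOAD", 2, 101),   # load memory[101] into register 2
    2: ("ADD", 1, 2),      # register 1 = register 1 + register 2
    3: ("STORE", 1, 102),  # store register 1 into memory[102]
    4: ("HALT", 0, 0),
    100: 7, 101: 5, 102: 0,
}
registers = {1: 0, 2: 0}
ip = 0  # pointer to the location of the next instruction

while True:
    # Fetch cycle: fetch the instruction, increment the pointer,
    # and separate the instruction into its components.
    op, a, b = memory[ip]
    ip += 1
    # Execution cycle: perform the data movement or transformation
    # and store the result.
    if op == "LOAD":
        registers[a] = memory[b]
    elif op == "STORE":
        memory[b] = registers[a]
    elif op == "ADD":
        registers[a] = registers[a] + registers[b]
    elif op == "HALT":
        break

print(memory[102])  # 12
```

Running the sketch adds 7 and 5 and stores the result, mirroring the control unit/ALU division of labor described above.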
FIGURE 4.2 CPU components
At the conclusion of the execution cycle, a new fetch cycle is started. The control unit keeps track of the next program instruction location by incrementing a pointer after each fetch. The second program instruction is retrieved during the second fetch cycle, the third instruction is retrieved during the third fetch cycle, and so forth.
CPU REGISTERS:
Registers play two primary roles in CPU operation. First, they provide a temporary storage area for data the currently executing program needs quickly or frequently. Second, they store information about the currently executing program and CPU status, for example, the address of the next program instruction, error messages, and signals from external devices.
General-Purpose Registers:
General-purpose registers are used only by the currently executing program. They typically hold intermediate results or frequently used data values, such as loop counters or array indexes. Register accesses are fast because registers are implemented in the CPU. In contrast, storing and retrieving from primary storage is much slower. Using registers to store data needed immediately or frequently increases program execution speed by avoiding wait states.
Adding general-purpose registers increases execution speed but only up to a point. Any program has a limited number of intermediate results or frequently used data items, so CPU designers try to find the optimal balance among the number of general-purpose registers, the extent to which a typical process will use these registers, and the cost of implementing these registers. As the cost of producing registers has decreased, their number has increased. Current CPUs typically provide several dozen general-purpose registers.
Special-Purpose Registers:
Every processor has special-purpose registers used by the CPU for specific purposes. Some of the more important special-purpose registers are as follows:
- Instruction register
- Instruction pointer
- Program status word
When the control unit fetches an instruction from memory, it stores it in the instruction register. The control unit then extracts the op code and operands from the instruction and performs any additional data movement operations needed to prepare for execution.
The process of extracting the op code and operands, loading data inputs, and signaling the ALU is called instruction decoding.
The instruction pointer (IP) is also called the program counter. Recall that the CPU alternates between the instruction (fetch and decode) and execution (data movement or transformation) cycles. At the end of each execution cycle, the control unit starts the next fetch cycle by retrieving the next instruction from memory. This instruction's address is stored in the instruction pointer, and the control unit increments the instruction pointer during or immediately after each fetch cycle.
The CPU deviates from sequential execution only if a BRANCH instruction is executed. A BRANCH is implemented by overwriting the instruction pointer's value with the address of the instruction to which the BRANCH is directed. An unconditional BRANCH instruction is actually a MOVE from the branch operand, which contains the branch address, to the instruction pointer.
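The idea that a BRANCH is just a write into the instruction pointer can be sketched as follows. The op codes and the conditional-branch mnemonic are invented for illustration; only the IP-overwrite mechanism reflects the text above:

```python
# A sketch of BRANCH as an overwrite of the instruction pointer (IP).
# Op codes (MOVE_IMM, SUB_IMM, BRANCH_IF_POS, HALT) are illustrative
# assumptions, not a real instruction set.

program = [
    ("MOVE_IMM", "R1", 3),       # R1 = 3 (loop counter)
    ("SUB_IMM", "R1", 1),        # R1 = R1 - 1
    ("BRANCH_IF_POS", "R1", 1),  # if R1 > 0, move address 1 into the IP
    ("HALT", None, None),
]
registers = {"R1": 0}
ip = 0

while True:
    op, reg, operand = program[ip]
    ip += 1  # sequential execution: increment the pointer after each fetch
    if op == "MOVE_IMM":
        registers[reg] = operand
    elif op == "SUB_IMM":
        registers[reg] -= operand
    elif op == "BRANCH_IF_POS":
        if registers[reg] > 0:
            ip = operand  # the BRANCH is simply a MOVE into the IP
    elif op == "HALT":
        break

print(registers["R1"])  # 0
```

The loop body at address 1 executes three times; each taken branch is nothing more than storing the address 1 into `ip`, after which the normal fetch cycle resumes from that address.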
The program status word (PSW) contains data describing the CPU status and the currently executing program. Each bit in the PSW is a separate Boolean variable, sometimes called a flag, representing one data item. The content and meaning of flags vary widely from one CPU to another. In general, PSW flags have three main uses:
- Store the result of a comparison operation
- Control conditional BRANCH execution
- Indicate actual or potential error conditions
The sample program shown previously in Figure 4.8 performed comparison by using the XOR, ADD, and SHIFT instructions. The result was stored in a general-purpose register and interpreted as a Boolean value. This method was used because the instruction set was limited. Most CPUs provide one or more COMPARE instructions. COMPARE takes two operands and determines whether the first is less than, equal to, or greater than the second. Because there are three possible conditions, the result can't be stored in a single Boolean variable. Most CPUs use two PSW flags to store a COMPARE's result. One flag is set to true if the operands are equal, and the other flag indicates whether the first operand is greater than or less than the second. If the first flag is true, the second flag is ignored.
To implement program branches based on a COMPARE result, two additional conditional BRANCH instructions are provided: one based on the equality flag and the other based on the less-than or greater-than flag. Using COMPARE with related conditional BRANCH instructions simplifies machine-language programs, which speeds up their execution.
Other PSW flags represent status conditions resulting from the ALU executing instructions. Conditions such as overflow, underflow, or an attempt to perform an undefined operation (dividing by zero, for example) are represented by PSW flags. After each execution cycle, the control unit tests PSW flags to determine whether an error has occurred.
PSW flags can also be tested by an OS or an application program to determine appropriate error messages and corrective actions.
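The two-flag encoding of a COMPARE result described above can be sketched directly. The flag names are illustrative assumptions; real CPUs name and arrange their PSW flags differently:

```python
# A sketch of storing a three-way COMPARE result in two PSW flags.
# Flag names ("equal", "greater") are invented for illustration.

psw = {"equal": False, "greater": False}

def compare(first, second):
    """Set the equality flag; the second flag distinguishes
    greater-than from less-than and is ignored when 'equal' is set."""
    psw["equal"] = (first == second)
    psw["greater"] = (first > second)

compare(5, 5)
assert psw["equal"]                              # second flag ignored

compare(7, 3)
assert not psw["equal"] and psw["greater"]       # first operand greater

compare(2, 9)
assert not psw["equal"] and not psw["greater"]   # first operand less
```

A conditional BRANCH based on equality would test the first flag; a BRANCH based on ordering would test the second, which is exactly why two conditional BRANCH instructions pair naturally with one COMPARE.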
A typical computer system has a variety of primary and secondary storage devices. The CPU and a small amount of high-speed RAM usually occupy the same chip. Slower RAM on separate chips composes the bulk of primary storage. One or more magnetic disk drives are usually complemented by an optical disc drive and at least one form of removable magnetic storage.
The range of storage devices in a single computer system forms a memory-storage hierarchy, as shown in Figure 5.3. Cost and access speed generally decrease as you move down the hierarchy. Because of lower cost, capacity tends to increase as you move down the hierarchy. A computer designer or purchaser attempts to find an optimal mix of cost and performance for a particular purpose.
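The cost/performance tradeoff in the hierarchy is often quantified as an effective access time: a weighted average of each level's access time, weighted by the fraction of accesses that level satisfies. The timings and hit ratio below are illustrative assumptions, not measurements of any particular system:

```python
# Effective access time for a two-level hierarchy: a fraction of accesses
# (the hit ratio) is satisfied by the fast level; the rest fall through
# to the slow level. All numbers are illustrative assumptions.

def effective_access_time(hit_ratio, fast_ns, slow_ns):
    return hit_ratio * fast_ns + (1 - hit_ratio) * slow_ns

# e.g., on-chip memory at 1 ns versus separate-chip RAM at 50 ns
print(round(effective_access_time(0.95, 1.0, 50.0), 2))  # 3.45
```

Even with 95% of accesses served by the fast level, the occasional slow access dominates the average, which is why designers work to keep frequently used data as high in the hierarchy as possible.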
As discussed, the critical performance characteristics of primary storage devices are access speed and data transfer unit size. Primary storage devices must closely match CPU speed and word size to avoid wait states. CPU and primary storage technologies have evolved in tandem; in other words, CPU technology improvements are applied to the construction of primary storage devices.
Storing Electrical Signals:
Data is represented in the CPU as digital electrical signals, which are also the basis of data transmission for all devices attached to the system bus. Any storage device or controller must accept electrical signals as input and generate electrical signals as output.
Electrical power can be stored directly by various devices, including batteries and capacitors. Unfortunately, there's a tradeoff between access speed and volatility. Batteries are quite slow to accept and regenerate electrical current. With repeated use, they also lose their capability to accept a charge. Capacitors can charge and discharge much faster than batteries. However, small capacitors lose their charge fairly rapidly and require a fresh injection of electrical current at regular intervals (hundreds or thousands of times per second).
An electrical signal can be stored indirectly by using its energy to alter the state of a device, such as a mechanical switch, or a substance, such as a metal. An inverse process regenerates an equivalent electrical signal. For example, electrical current can generate a magnetic field. The magnetic field's strength can induce a permanent magnetic charge in a nearby metallic compound, thus writing the bit value to the metallic compound. To read the stored value, the
stored magnetic charge is used to generate an electrical signal equivalent to the one used to create the original magnetic charge. Magnetic polarity, which is positive or negative, can represent the values 0 and 1.
Early computers implemented primary storage as rings of ferrous material (iron and iron compounds), a technology called core memory. These rings, or cores, are embedded in a two-dimensional wire mesh. Electricity sent through two wires induces a magnetic charge in one metallic ring. The charge's polarity depends on the direction of electrical flow through the wires.
Modern computers use memory implemented with semiconductors. Basic types of semiconductor memory include random access memory and nonvolatile memory. There are many variations of each memory type, described in the following sections.
Random Access Memory:
Random access memory (RAM) is a generic term describing primary storage devices with the following characteristics:
- Microchip implementation with semiconductors
- Capability to read and write with equal speed
- Random access to stored bytes, words, or larger data units
RAM is fabricated in the same manner as microprocessors. You might assume that microprocessor clock rates are well matched to RAM access speeds. Unfortunately, this isn't the case for many reasons, including the following:
- Reading and writing many bits in parallel requires additional circuitry.
- When RAM and microprocessors are on separate chips, there are delays when moving data from one chip to another.
There are two basic RAM types and several variations of each type. Static RAM (SRAM) is implemented entirely with transistors. The basic storage unit is a flip-flop circuit (see Figure 5.4). Each flip-flop circuit uses two transistors to store 1 bit. Additional transistors (typically two or four) perform read and write operations. A flip-flop circuit is an electrical switch that remembers its last position; one position represents 0 and the other represents 1.
These circuits require a continuous supply of electrical power to maintain their positions. Therefore, SRAM is volatile unless a continuous supply of power can be guaranteed.
In flash RAM, each write operation is mildly destructive, resulting in storage cell destruction after 100,000 or more write operations. Because of its slower write speed and limited number of write cycles, flash RAM currently has limited applications. It's used to store firmware programs, such as the system BIOS, that aren't changed frequently and are loaded into memory when a computer powers up. Flash RAM is also used in portable secondary storage devices, such as compact flash cards in digital cameras and USB flash drives. These storage devices typically mimic the behavior of a portable magnetic disk drive when connected to a computer system.
Flash RAM is also beginning to challenge magnetic disk drives as the dominant secondary storage technology (see Technology Focus: Solid-State Drives later in this chapter). Other NVM
technologies under development could overcome some shortcomings of flash RAM. Two promising candidates are magnetoresistive RAM and phase-change memory.
Magnetoresistive RAM (MRAM) stores bit values by using two magnetic elements, one with fixed polarity and the other with polarity that changes when a bit is written. The second magnetic element's polarity determines whether a current passing between the elements encounters low (a 0 bit) or high (a 1 bit) resistance. The latest MRAM generations have read and write speeds comparable with SRAM and densities comparable with DRAM, which make MRAM a potential replacement for both these RAM types. In addition, MRAM doesn't degrade with repeated writes, which gives it better longevity than conventional flash RAM.
Phase-change memory (PCM), also known as phase-change RAM (PRAM or PCRAM), uses a glasslike compound of germanium, antimony, and tellurium (GST). When heated to the correct temperatures, GST can switch between amorphous and crystalline states. The amorphous state exhibits low reflectivity (useful in rewritable optical storage media) and high electrical resistance. The crystalline state exhibits high reflectivity and low electrical resistance. PCM has lower storage density and slower read times than conventional flash RAM, but its write time is much faster, and it doesn't wear out as quickly.
Memory Packaging:
Memory packaging is similar to microprocessor packaging. Memory circuits are embedded in microchips, and groups of chips are packed on a small circuit board that can be installed or removed easily from a computer system. Early RAM and ROM circuits were packaged in dual inline packages (DIPs). Installing a DIP on a printed circuit board is a tedious and precise operation. Also, single DIPs mounted on the board surface occupy a large portion of the total surface area.
In the late 1980s, memory manufacturers adopted the single inline memory module (SIMM) as a standard RAM package.
Each SIMM incorporates multiple DIPs on a tiny printed circuit board.The edge of the circuit board has a row of electrical contacts, and the entire package is designedto lock into a SIMM slot on a motherboard. The double inline memory module (DIMM), anewer packaging standard, is essentially a SIMM with independent electrical contacts on bothsides of the module, as shown in Figure 5.5.
Current microprocessors include a small amount of on-chip memory (described in Chapter 6). As fabrication techniques improve, the amount of memory that can be packaged with the CPU on a single chip will grow. The logical extension of this trend is placing a CPU and all its primary storage on a single chip, which would minimize or eliminate the current gap between microprocessor clock rates and memory access speeds.
NOTE
Although main memory isn't currently implemented as part of the CPU, the CPU's need to load instructions and data from memory and store processing results requires close coordination between both devices. Specifically, the physical organization of memory, the organization of programs and data in memory, and the methods of referencing specific memory locations are critical design issues for both primary storage devices and processors. These topics are discussed in Chapter 11.