PC Architecture
A book by Michael Karbo

This book is protected by copyright. It has been published in many European countries but never in the English language. Therefore I decided to upload it to the Internet. It is free to use as it is for personal, non-commercial use. All rights belong to Michael Karbo. You may not in any way copy the contents.

These web pages have been produced from a Microsoft Word file. Hence the design, which could be a lot better. However, it is too much work for me to clean it all up. So you’ll have to live with it as it is.

• PC Architecture. Preface.
• Chapter 1. The PC, history and logic.
• Chapter 2. The Von Neumann model.
• Chapter 3. A data processor.
• Chapter 4. Intro to the motherboard.
• Chapter 5. It all starts with the CPU.
• Chapter 6. The CPU and the motherboard.
• Chapter 7. The south bridge.
• Chapter 8. Inside and around the CPU.
• Chapter 9. Moore’s Law.
• Chapter 10. The cache.
• Chapter 11. The L2 cache.
• Chapter 12. Data and instructions.
• Chapter 13. FPU’s and multimedia.
• Chapter 14. Examples of CPU’s.
• Chapter 15. The evolution of the Pentium 4.
• Chapter 16. Choosing a CPU.
• Chapter 17. The CPU’s immediate surroundings.
• Chapter 18. Overclocking.
• Chapter 19. Different types of RAM.
• Chapter 20. RAM technologies.
• Chapter 21. Advice on RAM.
• Chapter 22. Chipsets and hubs.
• Chapter 23. Data for the monitor.
• Chapter 24. Intro to the I/O system.
• Chapter 25. From ISA to PCI Express.
• Chapter 26. The CPU and the motherboard.
• Chapter 27. Inside and around the CPU.
• Chapter 28. The cache.
• Chapter 29. Data and instructions.
• Chapter 30. Inside the CPU.
• Chapter 31. FPU’s and multimedia.
• Chapter 32. Examples of CPU’s.
• Chapter 33. Choosing a CPU.
• Chapter 34. The CPU’s immediate surroundings.
• Chapter 35. Different types of RAM.
• Chapter 36. Chipsets and hubs.
• Chapter 37. Data for the monitor.
• Chapter 38. The PC’s I/O system.
• Chapter 39. From ISA to PCI.
• Chapter 40. I/O buses using IRQ’s.
• Chapter 41. Check your adapters.
• Chapter 42. I/O and The south bridge.
• Chapter 43. SCSI, USB and Firewire.
• Chapter 44. Hard disks, ATA and SATA.
• Chapter 45. System software. A small glossary.

Please enjoy the 45 chapters, all highly illustrated ...

Copyright Michael Karbo and ELI Aps., Denmark, Europe. www.karbosguide.dk
PC Architecture. Preface

Foreword

Welcome to a guide which has been very exciting to write. I have spent many years learning to understand PCs and how they work, and this knowledge has been my starting point. I started working with computers in about 1983, and in 1986 I bought my first PC – a small, cheap, British computer (an “Amstrad Joyce”), with a whopping 256 KB RAM and 140 KB diskettes to store programs on. It was a glorious little machine which I used to write a lot of teaching material. As one of the very first things I did, I naturally tried to take the machine apart – the little bit I then dared. In 1987 I got a job which involved working with real PCs (Olivettis), and this gave me real opportunities to repair, assemble and investigate the various components of a computer.

Since then, I have spent years studying the relationship between PC hardware and system software (BIOS, drivers and operating system), a subject which I found fascinating and which I still believe is important to understand. This led to my first computer book, “The PC-DOS Book”, which was published in January 1993. Since then I have published about 45 guides on PC hardware and software.

In 1996 I began work on this book. Initially I decided to collect all my articles together on the “Click and Learn” website. The material was extended and translated into both English and German. One of the advantages of the web medium is that the author can continually update the material. And now, after many years, I am finally ready with the “PC Architecture” book. I hope you like it!

Assumptions

This guide is written in easy language and contains a lot of illustrations. It should therefore not be too difficult to understand the content. However, I am assuming that the reader already has some practical experience with PCs and is familiar with the most basic computer jargon (bits, bytes, RAM, etc.).
I have also assumed some knowledge of Windows and the various system tools. Most PCs can easily be dismantled without needing special tools, and you should at least take the “lid” off your PC so that you can familiarise yourself with the electronics, preferably with a torch in hand. I’m not expecting you to immediately launch into the complete disassembly of your PC into its individual pieces. You are welcome to do this, but at your own risk.

However, I would like to give you enough insight into and confidence about your PC’s workings that you would dare to upgrade your PC, for example, with a new hard disk or more RAM. And if you should end up building your next PC yourself, I would be more than satisfied.

Structure of this guide

My explanations will shift from descriptions of the big picture to much more detailed analysis – and back again. After focusing on chips deep inside your PC, we may change perspective to more general and holistic observations, only to dig down again into the centre of some small circuit or other. My goal has been to keep my explanations at the popular level, and easy to understand – all the way through.

As I mentioned, this guide can be a great support tool for people who simply wish to build PCs themselves. But my goal is more specifically to communicate a holistic understanding of the various components of a PC and its data sets, the logical system they are part of, and the technological developments in the field. It is very useful to have this insight, whether you are a programmer, support person, educator, or just a “super user”.

I would like to thank the companies and people who have helped me to obtain accurate, detailed information and photographs. It is not always easy to confirm technical data, so it is possible there may be occasional mistakes in my presentation. Should that be the case, I hope you will bear with me. Try instead to appreciate the explanation of the big picture, which it has been my primary goal to set forth!

My thanks to Fujitsu-Siemens, AMD, Intel and others, for pictures and other support! I have taken most of the photographs using my Canon G2 camera, and processed the images using Photoshop. The other graphics have been produced using Fireworks. My thanks to Ebbe, Peter, Mikkel, Carl and Jette for their support and, not least, repeated reviews of the manuscript.

The book is protected by copyright

The book has been published in many European countries but never in the English language. Therefore I decided to upload it to the Internet. It is free to use as it is for personal, non-commercial use. All rights belong to Michael Karbo. You may not in any way copy the contents.

The design ...

These web pages have been produced from a Microsoft Word file. Hence the design, which could be a lot better. However, it is too much work for me to clean it all up. So you’ll have to live with it as it is.

I hope you enjoy the book.

Michael B. Karbo. March 2005. www.karbosguide.dk
Chapter 1. The PC, history and logic

The PC is a fascinating subject, and I want to take you on an illustrated, guided tour of its workings. But first I will tell you a bit about the background and history of computers. I will also have to introduce certain terms and expressions, since computer science is a subject with its own terminology. Then I will start to go through the actual PC architecture!

1. The historical PC

The PC is a microcomputer, according to the traditional division of computers based on size.

Microcomputers

No one uses the expression microcomputer much anymore, but that is what the PC actually is. If we look at computers based on size, we find the PC at the bottom of the hierarchy.

• Mainframes and supercomputers are the biggest computers – million-dollar machines, as big as a refrigerator or bigger. An example is the IBM model 390.
• Minicomputers are large, powerful machines which are often found at the centre of networks of “dumb” terminals and PCs. An example is IBM’s AS/400. A definition that was used in the past was that minicomputers cost between $10,000 and $100,000.
• Workstations are very powerful user machines. They have the capacity to execute technical/scientific programs and calculations, and typically use a UNIX variant or Windows NT as their operating system. Workstations used to be equipped with powerful RISC processors, like Digital Alpha, Sun Sparc or MIPS, but today workstations can be configured with one or more of Intel’s more powerful CPUs.
• The PC is the baby of the family: small, cheap, mass-produced computers which typically run Windows and which are used for standard programs which can be purchased anywhere.

The point of the story is that the baby has grown up, and has actually taken the lead! Today, PCs are as powerful as minicomputers and mainframes were in the past. Powerful PCs can now compete with the much more expensive workstations. How has the PC come so far?
Fig. 1. Data processing in 1970. Digital PDP 11/20.

The PC’s childhood

Let’s take a short look at the historical background of the modern PC, which originated in 1981. In less than 20 years, the PC went through a technological development which surpassed everything we had seen before. The PC has simply revolutionised society’s production and communication in just about every sector. And the revolution appears to be set to continue for many more years.

Today the PC is an industry standard. More than 90% of all microcomputers are based on Microsoft’s software (Windows) and standardised hardware designed primarily by Intel. This platform or design is sometimes called Wintel, a combination of the two product names.

But at the time the PC was introduced by IBM, it was just one of many 16-bit microcomputers. For example, the company Digital sold many of their “Rainbow” machines in the middle of the 1980s, which I have worked with myself. These other machines were not IBM-compatible, but they weren’t very different from IBM’s machines either, since they were all based on Intel’s 8088 CPU. There were actually a number of different types of PC in the 1980s.

Fig. 2. DEC Rainbow from 1982. It cost around 8,000 euros – then!

But over just a few years, late in the 1980s, the market got behind IBM’s standards for PC architecture. Using the Intel 8086 and 8088 processors and Microsoft’s operating systems (DOS initially, later Windows), the PC revolution got seriously underway. From that time on, we talked about IBM-compatible PCs, and as the years passed, the PC developed to become the triumphant industry standard.

In parallel with the IBM/Intel project, Apple developed the popular Macintosh computers, which from the very start were very user-friendly, with a graphical user interface. The Macintosh is a completely different platform from the Windows-based PCs I am describing in this guide. The Macintosh has also been released in generation after generation, but it is not compatible with the IBM/Intel/Microsoft PC standard.

Fig. 3. An almost IBM-compatible PC from 1984.

In the table below you can see the development of the PC and its associated operating systems. The PC was actually a further development of the 8-bit microcomputers (like the Commodore 64, etc.), which were very popular until late in the 1980s.

The computer shown in Fig. 2 is a very interesting hybrid. It marked the transition from 8-bit to 16-bit architecture. The computer contains two processors: an 8-bit Z80 and a 16-bit 8088. This enabled it to run several different operating systems, such as CP/M and MS-DOS 2. The two processors, each with their own bus, shared the 128 KB RAM. It was a particularly advanced machine.

Fig. 4. The microprocessor has entered its fourth decade.

IBM and the PC’s success

If we look back at the early PC, there are a number of factors which have contributed to its success:

• From the very beginning the PC had a standardised and open architecture.
• It was well-documented and had extensive expansion options.
• The PC was cheap, simple and robust (but definitely not advanced technology).

Initially, the PC was an IBM product. It was their design, built around an Intel processor (8088) and adapted to Microsoft’s simple operating system, MS-DOS.

But other companies were quick to get involved. They found that they could freely copy the important BIOS system software and the central ISA bus. None of the components were patented. That wouldn’t happen today! But precisely because of this open architecture, a whole host of companies gradually appeared, which developed and supplied IBM-compatible PCs and parts.

Clones

In the late 1980s there was a lot of talk about clones. A clone is a copycat machine: a machine which can do exactly the same things as an original PC (from IBM), and where the individual components (e.g. the hard disk) can be identical to the original’s. The clone just has another name, or is sold without any name.

We don’t distinguish as much today between the various PC manufacturers, but they can still be divided into two groups:

• Brand name PCs from IBM, Compaq, Dell, Fujitsu-Siemens, etc. These companies are large enough to (potentially) develop their own hardware components.
• Clones, which are built from standard components. Anyone can build their own clone, like the one shown in Fig. 15.

However, the technology is basically the same for all PCs, regardless of the manufacturer. And this common technology is the subject I am going to expound.

Finally, I just want to mention the term servers. They are special PCs built to serve networks. Servers can, in principle, be built using the same components that are used in normal PCs. However, other motherboards, a different type of RAM and other controllers are often used.
My review will concentrate primarily on standard PCs.

Bit width

The very first microprocessor Intel produced (the model 4004, also discussed later in the guide) was 4 bit. This meant that in a single operation, the processor could process numbers which were 4 bits long. One can say that the length of a machine word was 4 bits. The Intel 4004 was a 4-bit processor with a 4-bit architecture. Later came processors which could process 8 bits at a time, like the Intel 8008, 8080, and not least, the Zilog Z80 (a very large number were sold). These were used in a large number of 8-bit computers throughout the 1970s and well into the 1980s.

The PC (in the 1980s) was initially a 16-bit computer. With the development of the 80386 processor, there was a change to the 32-bit architecture which we are still using today.

Now there is a 64-bit architecture on the way, both from Intel (with the Itanium processor) and from AMD (with various Athlon 64 processors). But it is still too early to predict the extent to which the 64-bit architecture will spread into normal, Windows-based PCs.

Width     Processor                          Application
4 bit     4004                               Pocket calculators
8 bit     8080                               Small CP/M based home computers
16 bit    8086, 8088, 80286                  IBM-compatible PCs running MS-DOS
32 bit    80386 - Pentium 4                  32-bit versions of Windows (Windows 95/98/2000/XP)
64 bit    Athlon 64, Pentium 4, Itanium      Server software, 64-bit versions of Windows, Linux etc.

Fig. 5. Today’s PCs use mostly 32-bit architecture.

The pre-history of computers

Our PCs have “spiritual roots” going back 350 years. Mathematicians and philosophers like Pascal, Leibniz, Babbage and Boole laid the foundations with their theoretical work.

The Frenchman Blaise Pascal lived from 1623 to 1662, and was a mathematical genius from a very young age. As an 18-year-old, he constructed a calculating machine, and his mathematical theories have had enormous significance for all later scientific research.

The Englishman George Boole (1815-1864) was also a natural talent. He grew up in very humble surroundings, and was largely self-taught. When he was 20 years old, Boole founded a mathematics school and then began to develop the symbolic logic which is currently the cornerstone of every program.

Another Englishman, Charles Babbage, began developing various mechanical calculating machines in 1823, which are today considered to be the theoretical forerunners of the computer. Babbage’s “Analytical Engine” could perform data calculations using punched cards. The machine was never fully realised; the plan was to power it using steam.
Fig. 6. A construction drawing for one of Babbage’s calculating machines, which consisted of several tons of brass machinery.

Fig. 7. Charles Babbage (1791-1871) and his staff constructed various programs (software) for his calculating machine. Babbage is therefore called the “father of the computer” today.

However, it was only in the 20th century that electronics advanced sufficiently to make practical exploitation of these theories interesting.

The American John Vincent Atanasoff (1903-1995), who was of Bulgarian descent, is credited as the inventor of the electronic digital computer. Atanasoff was a genius. At the age of nine, he studied algebra with the help of his mother Iva Lucena Purdy, a mathematics schoolteacher.

In the 1930s Atanasoff was a professor of mathematics and physics at Iowa State University in the USA. Here he used the existing tools like the Monroe calculator and IBM tabulator for his calculations, but he found these machines too slow and inaccurate. For years he worked on the idea that there should be better machines for calculation. His thought was to produce a digital machine, since Atanasoff had concluded that mathematical devices fell into two classes, analog and digital. The term digital had not yet been invented, so he called this class of devices “computing machines proper”.
In the winter of 1937-38 Atanasoff was very frustrated by his lack of progress. After a long car ride (Atanasoff was fond of fast cars) he found himself drinking whisky in a bar (he was fond of scotch as well). Suddenly he had the solution: a machine built on four principles. Among other things, it should work on base-two (binary) numbers instead of base-10 and use condensers for memory. Atanasoff teamed up with a brilliant young electrical engineering student, Clifford Berry, and later the 700-pound machine called the Atanasoff-Berry Computer was developed. This was the first electronic digital computer.

Another pioneer was the German Konrad Zuse (1910-1995). He was still in his twenties when he constructed his own mechanical binary computer, called Z1.

During the Second World War Zuse’s computer Z3 was used in the German aircraft industry. It was the first computer in the world to be programmed with software. It is interesting that Zuse’s computers were developed entirely independently of other contemporary scientists’ work.

Fig. 8. Konrad Zuse. One of the first scientists to produce working computers.

During the war, the Germans also used an advanced code machine (Fig. 9), which the English expended a great deal of effort on “hacking”. They were successful, and this contributed to laying the foundation for the later development of computing.

An interesting piece of trivia: In 1947, the American computer expert Howard Aiken stated that there was only a need for six computers in the entire USA. History proved him wrong.

Fig. 9. The German “ENIGMA” code machine.
Chapter 2. The Von Neumann model

The modern microcomputer has roots going back to the USA in the 1940s. Of the many researchers, the Hungarian-born mathematician John von Neumann (1903-57) is worthy of special mention. He developed a very basic model for computers which we are still using today.

Fig. 10. John von Neumann (1903-57). Progenitor of the modern, electronic PC.

Von Neumann divided a computer’s hardware into 5 primary groups:

• CPU
• Input
• Output
• Working storage
• Permanent storage

This division provided the actual foundation for the modern PC, as von Neumann was the first person to construct a computer which had working storage (what we today call RAM). And the amazing thing is, his model is still completely applicable today. If we apply the von Neumann model to today’s PC, it looks like this:

Fig. 11. The Von Neumann model in the year 2004.

Today we talk about multimedia PCs, which are made up of a wealth of interesting components. Note here that modems, sound cards and video cards, etc. all function as both input and output units. But this doesn’t happen simultaneously, as the model might lead you to believe. At the basic level, the von Neumann model still applies today. All the components and terms shown in Fig. 11 are important to be aware of. The model generally helps in gaining a good understanding of the PC, so I recommend you study it.
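As a rough illustration of the five groups, the sketch below tags a few devices with the von Neumann group or groups they belong to. It is only a minimal Python sketch: the names `Group`, `DEVICES` and `devices_in`, and the particular devices listed, are assumptions made for illustration and are not part of the model itself.

```python
from enum import Flag, auto


class Group(Flag):
    """The five primary hardware groups of the von Neumann model."""
    CPU = auto()
    INPUT = auto()
    OUTPUT = auto()
    WORKING_STORAGE = auto()    # what we today call RAM
    PERMANENT_STORAGE = auto()


# Hypothetical device list, chosen only for illustration.
DEVICES = {
    "processor": Group.CPU,
    "keyboard": Group.INPUT,
    "printer": Group.OUTPUT,
    "RAM": Group.WORKING_STORAGE,
    "hard disk": Group.PERMANENT_STORAGE,
    # Some units act as both input and output.
    "modem": Group.INPUT | Group.OUTPUT,
    "sound card": Group.INPUT | Group.OUTPUT,
}


def devices_in(group: Group) -> list[str]:
    """Return the names of all devices that belong to the given group."""
    return [name for name, groups in DEVICES.items() if group & groups]


if __name__ == "__main__":
    for group in Group:
        print(group.name, "->", devices_in(group))
```

A flag type is used here because one device may sit in more than one group, which is exactly the case for the modem and the sound card mentioned above.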
Fig. 12. Cray supercomputer, 1976.

In April 2002 I read that the Japanese had developed the world’s fastest computer. It is a huge thing (the size of four tennis courts), which can execute 35.6 trillion mathematical operations per second. That’s five times as many as the previous record holder, a supercomputer from IBM.

The report from Japan shocked the Americans, who considered themselves to be the leaders in the area of computer technology. While the American supercomputers are used for the development of new weapons systems, the Japanese one is to be used to simulate climate models.

2. The PC’s system components

This chapter is going to introduce a number of the concepts which you have to know in order to understand the PC’s architecture. I will start with a short glossary, followed by a brief description of the components which will be the subject of the rest of this guide, and which are shown in Fig. 11.

The necessary concepts

I’m soon going to start throwing words around like interface, controller and protocol. These aren’t arbitrary words. In order to understand the transport of data inside the PC we need to agree on various jargon terms. I have explained a handful of them below. See also the glossary at the back of the guide.

The concepts below are quite central. They will be explained in more detail later in the guide, but start by reading these brief explanations.

Concept: Explanation

Binary data: Data, be it instructions, user data or something else, which has been translated into sequences of 0’s and 1’s.

Bus width: The size of the packet of data which is processed (e.g. moved) in each work cycle. This can be 8, 16, 32, 64, 128 or 256 bits.

Bandwidth: The data transfer capacity. This is measured in, for example, kilobits/second (Kbps) or megabytes/second (MBps).

Cache: A temporary storage, a buffer.

Chipset: A collection of one or more controllers. Many of the motherboard’s controllers are gathered together into a chipset, which is normally made up of a north bridge and a south bridge.

Controller: A circuit which controls one or more hardware components. The controller is often part of the interface.

Hubs: This expression is often used in relation to chipset design, where the two north and south bridge controllers are called hubs in modern design.

Interface: A system which can transfer data from one component (or subsystem) to another. An interface connects two components (e.g. a hard disk and a motherboard). Interfaces are responsible for the exchange of data between two components. At the physical level they consist of both software and hardware elements.

I/O units: Components like mice, keyboards, serial and parallel ports, screens, network and other cards, along with USB, FireWire and SCSI controllers, etc.

Clock frequency: The rate at which data is transferred, which varies quite a lot between the various components of the PC. Usually measured in MHz.

Clock tick (or clock cycle): A single clock tick is the smallest measure in the working cycle. A working cycle (e.g. the transport of a portion of data) can be executed over a period of about 5 clock ticks (it “costs” 5 clock cycles).

Logic: An expression I use to refer to software built into chips and controllers. E.g. an EIDE controller has its own “logic”, and the motherboard’s BIOS is “logic”.

MHz (Megahertz): A “speed” which is used to indicate clock frequency. It really means: million cycles per second. The more MHz, the more data operations can be performed per second.

North bridge: A chip on the motherboard which serves as a controller for the data traffic close to the CPU. It interfaces with the CPU through the Front Side Bus (FSB) and with the memory through the memory bus.

Protocols: Electronic traffic rules which regulate the flow of data between two components or systems. Protocols form part of interfaces.

South bridge: A chip on the motherboard which works together with the north bridge. It looks after the data traffic which is remote from the CPU (I/O traffic).

Fig. 13. These central concepts will be used again and again. See also the definitions in the glossary.
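To see how bus width, clock frequency and clock ticks fit together, here is a small Python sketch of a simplified model in which the theoretical bandwidth is the bus width multiplied by the number of transfers per second. The function name `bandwidth_mb_per_s` and the example figures (a 32-bit bus at 33 MHz) are assumptions chosen for illustration only, and real buses have overheads which this model ignores.

```python
def bandwidth_mb_per_s(bus_width_bits: int, clock_mhz: float,
                       ticks_per_transfer: int = 1) -> float:
    """Theoretical peak bandwidth in megabytes per second (1 MB = 10**6 bytes).

    Simplified model (an assumption): one packet of `bus_width_bits` is moved
    every `ticks_per_transfer` clock ticks.
    """
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1_000_000 / ticks_per_transfer
    return bytes_per_transfer * transfers_per_second / 1_000_000


if __name__ == "__main__":
    # Hypothetical 32-bit bus running at 33 MHz, one transfer per clock tick.
    print(bandwidth_mb_per_s(32, 33))
    # The same bus if each transfer "costs" 5 clock ticks.
    print(bandwidth_mb_per_s(32, 33, ticks_per_transfer=5))
```

In this model the 32-bit bus moves 4 bytes per transfer, so 33 million transfers per second give 132 MBps; if each transfer costs 5 clock ticks, the figure drops to a fifth of that.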
Chapter 3. A data processor

The PC is a digital data processor. In practice this means that all analogue data (text, sound, pictures) gets translated into masses of 0’s and 1’s. These numbers (binary values) exist as tiny electrical charges in microscopic circuits, where a transistor can take on two states: charged or not charged. This is one picture of a bit, which you can say is either turned on or off.

There can be billions of these microscopic bits hidden inside a PC, and they are all managed using electronic circuits (EDP stands for electronic data processing). For example, the letter “A” (like all other characters) can be represented by a particular 8-digit bit pattern. For “A”, this 8-digit bit pattern is 01000001.

When you type an “A” on your keyboard, you create the digital data sequence 01000001. To put it simply, the “A” exists as a pattern in eight transistors, where some are “turned on” (charged) and others are not. Together these 8 transistors make up one byte.

The same set of data can be stored in the video card’s electronics, in RAM or even as a magnetic pattern on your hard disk:

Fig. 14. The same data can be found on the screen, on the hard disk and in RAM.

The set of data can also be transferred to a printer, if you want to print out your text. The printer electronically and mechanically translates the individual bits into analogue letters and numbers which are printed on the paper. In this way, there are billions of bytes constantly circulating in your PC for as long as it is switched on. But how are these 0’s and 1’s moved around, and which components are responsible?

The physical PC

The PC is made up of a central unit (also called a system unit) and some external devices. The central unit is a box (a cabinet), which contains most of the computer’s electronics (the internal devices). The external devices are connected to the central unit (shown below) using cables.
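Returning briefly to the bit pattern for the letter “A” described earlier in this chapter, the following Python sketch shows the translation between a character and its 8-digit bit pattern. It assumes the ASCII character encoding, which the text does not name explicitly, and the helper names `to_bits` and `from_bits` are made up for the example.

```python
def to_bits(char: str) -> str:
    """Return the 8-digit bit pattern of a single-byte character."""
    code = ord(char)            # numeric character code, e.g. 65 for "A"
    if code > 255:
        raise ValueError("character does not fit in one byte")
    return format(code, "08b")  # binary, zero-padded to 8 digits


def from_bits(bits: str) -> str:
    """Translate an 8-digit bit pattern back into its character."""
    return chr(int(bits, 2))


if __name__ == "__main__":
    pattern = to_bits("A")
    print(pattern)             # 01000001
    print(from_bits(pattern))  # A
```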
Fig. 15. The central unit contains the majority of a PC’s electronics.

The cabinet shown in Fig. 15 is a minitower. In this cabinet, the motherboard is mounted vertically down one side. You can buy a taller cabinet of the same type. It’s called a tower. If the cabinet is designed to be placed on a desk (under the monitor), it is called a desktop cabinet.

Fig. 16. A desktop cabinet.

Fig. 17 shows a list of most of the components of the PC. Some of them are internal, i.e., they are inside the cabinet. Other components are external; they are located outside the cabinet.

Read through the list and think about what the words refer to. Do you know all these devices?

Internal devices
  Motherboard: CPU, RAM, cache, ROM circuits containing the BIOS and startup programs. Chipsets (controllers). Ports, busses and slots. EIDE interface, USB, AGP, etc.
  Drives: Hard disk(s), diskette drive, CD-ROM, DVD, etc.
  Plug-in cards: Graphics card (video adapter), network card, SCSI controller. Sound card, video and TV card. Modem and ISDN card.

External devices
  Keyboard, mouse, joystick, screen, printer, scanner, speakers, external drives, tape drive, MIDI units, modem, digital camera.

Fig. 17. The PC’s components can be divided into internal and external groups.

Speed – the more we get, the more we want

The PC processes data. It performs calculations and moves data between the various components. It all happens at our command, and we want it to happen fast.

It is interesting to note that current technological development is basically focusing exclusively on achieving faster data processing. The entire PC revolution over the last 20 years is actually just a sequence of ever increasing speed records in the area of data transfer. And there doesn’t seem to be any upper limit to how much data transfer speed we need.

This continual speed optimisation is not just occurring in one place in the PC; it’s happening everywhere that data is moved.

• The transfer from RAM to CPU – it has to be faster.
• The transfer between hard disk and motherboard – it has to be faster.
• Data to the screen – it has to be faster.
• Etc.

The PC can be viewed as a series of more or less independent subsystems, which can each be developed to permit greater capacity and higher speed. We constantly need new standards, because of the new, faster interfaces, busses and protocols (which all work together), delivering better performance.

Fig. 18. Data transfer between all the components of the PC has to be fast.

Interfaces hold it all together
The PC is the sum of all these subsystems. At each boundary between one subsystem and another, we find an interface. That is, an electrical system which connects the two subsystems together and enables them to exchange data.

Fig. 19. The hardware components are connected to each other via interfaces.

The concept of an interface is a little abstract, as it most accurately refers to a standard (a set of rules for the exchange of data). In practice, an interface can consist of, for example, two controllers (one at each end of the connection), a cable, and some software (protocols, etc.) contained in the controllers.

The controllers are small electronic circuits which control the movement of data to and from the device.

Fig. 20. An interface connects two hardware devices. An interface can consist of controllers with built-in software, cables, etc.

There are many interfaces in the PC, because there are many subsystems which have to be connected. Each interface is normally tailor-made for the job, and tuned to achieve maximum bandwidth (data transfer capacity) between the two components.

An example of an interface

Later in the guide I want to explore the EIDE interface in more detail, but I will use it here as a specific example of an interface. Keep your attention focused on the concept of an interface. You may not understand all the details, but that doesn’t matter here.

If we want to connect a hard disk to a motherboard, this is achieved using an EIDE interface. If we look more closely at this interface, it can be divided into a series of subcomponents. The interface consists of both hardware and logic, the most important being the two EIDE controllers. One is integrated into the hard disk’s electronics, and the other is integrated into the motherboard, where it forms part of the chipset’s south bridge.

Fig. 21. Underneath the hard disk you can see a small printed circuit board. This incorporates the controller functions which work together with the corresponding controller in the PC’s motherboard.

The advantage of this system is that the hard disk can be connected directly to the motherboard with a cable. But the cable still runs from one controller to the other.

The two controllers work according to a common standard, which is the ATA standard. This standard includes a set of protocols which are continually being developed in new versions. Let’s say our specific hard disk can use the ATA/100 protocol. That means the controller on the motherboard also has to be compatible with ATA/100, and the cable as well. When all that is in place, we have a working ATA interface.

Fig. 22. A specific example of an interface.
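The requirement that both controllers and the cable support the same protocol can be sketched in a few lines of Python. This is only a toy model, not how real ATA hardware negotiates its transfer mode: the names `Part`, `PROTOCOLS` and `common_protocol`, and the short list of protocol versions, are assumptions made for illustration.

```python
from dataclasses import dataclass


# Protocol versions ordered from slowest to fastest (illustrative subset).
PROTOCOLS = ["ATA/33", "ATA/66", "ATA/100"]


@dataclass(frozen=True)
class Part:
    """One element of the interface: a controller or the cable."""
    name: str
    fastest: str  # fastest protocol version this part supports


def common_protocol(parts: list[Part]) -> str:
    """Return the fastest protocol that every part supports.

    Assumption: a part that supports a given version also supports all
    slower versions, so the slowest part sets the limit.
    """
    return min((p.fastest for p in parts), key=PROTOCOLS.index)


if __name__ == "__main__":
    interface = [
        Part("hard disk controller", "ATA/100"),
        Part("south bridge controller", "ATA/100"),
        Part("cable", "ATA/100"),
    ]
    print(common_protocol(interface))
```

With all three parts at ATA/100 the interface as a whole can run ATA/100; if any one part in this model only supported a slower version, that version would become the limit for the whole interface.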
Copyright Michael Karbo and ELI Aps., Denmark, Europe.

Chapter 4. Intro to the motherboard

Now let’s dive into the PC box. The whole computer is built up around a motherboard, and it is the most important component in the PC.

In this chapter I will introduce the motherboard and its components.
• Construction of the motherboard.
• The CPU.
• The busses.
• Chipsets (controllers).

I will work through the individual components in more detail later in the guide. This chapter will describe the architecture in “broader” brush strokes.

Data exchange in the motherboard

The motherboard is a large printed circuit board, which has lots of chips, connectors and other electronics mounted on it. Computer nerds simply call it a board.

Inside the PC, data is constantly being exchanged between or via the various devices shown in Fig. 17. Most of the data exchange takes place on the motherboard itself, where all the components are connected to each other:

Fig. 23. Data exchange on the motherboard.

In relation to the PC’s external devices, the motherboard functions like a central railway station.
Fig. 24. The motherboard is the hub of all data exchange.

All traffic originates from or ends up in the motherboard, which is appropriately called the most important component of the PC. I will show you pictures of the individual components of the motherboard later, but this is what it looks like as a total unit:

Fig. 25. A motherboard is a board covered with electronics.

Find your motherboard

If you are in a position to look at a motherboard, I would recommend you do so. It is a very good exercise to try to identify the various components on a motherboard.

The motherboard is really just a big plastic sheet which is full of electrical conductors. The conductors (also called tracks) run across and down, and in several layers, in order to connect all the individual components and transfer data between them.

The motherboard is mounted in the PC box using small plastic brackets and screws. The cabinet and the motherboard are made to suit each other, so there are holes in the metal for the connectors mounted on the board. Finally, the motherboard has to be connected to the PC’s power supply installed in the cabinet. This is done using a standard connector:
Fig. 26. The power supply is connected to the motherboard via a multicoloured cable and a large white plastic connector.

Now we’ll look at the various types of components on the motherboard.

Chips

The active devices on the motherboard are gathered together in chips. These are tiny electronic circuits which are crammed with transistors. The chips have various functions. For example, there are:
• ROM chips, which store the BIOS and other programs.
• CMOS storage, which contains user-defined data used by the setup program.
• The chipset, which normally consists of two so-called controllers, which incorporate a number of very essential functions.

You’ll learn a lot about these chips and their functions later in the guide.

Sockets

You will also find sockets on the motherboard. These are holders, which have been soldered to the motherboard. The sockets are built to exactly match a card or a chip.

This is how a number of components are directly connected to the motherboard. For example, there are sockets (slots) to mount:
• The CPU and working storage (the RAM modules).
• Expansion cards, also called adapters (PCI, AGP and AMR slots, etc.).

The idea of a socket is that you can install a component directly on the motherboard without needing special tools. The component has to be pushed carefully and firmly into the socket, and will then hopefully stay there.
Fig. 27. Here you can see three (white) PCI sockets, in which plug-in cards can be installed.

Plugs, connectors and ports…

The motherboard also contains a number of inputs and outputs, to which various equipment can be connected. Most ports (also called I/O ports) can be seen where they end in a connector at the back of the PC. These are:
• Ports for the keyboard and mouse.
• Serial ports, the parallel port, and USB ports.
• Sockets for speakers/microphone etc.

Often, the various connectors are soldered onto the motherboard, so that the external components, like the keyboard, mouse, printer, speakers, etc., can be connected directly to the motherboard.

Fig. 28. Connectors mounted directly on a motherboard.

In addition to these sockets, connectors and ports, the motherboard contains a number of other contacts. These include:
• The big connector which supplies the motherboard with power from the power supply (see Fig. 26).
• Other connectors for the diskette drive, hard disk, CD-ROM drive, etc.
• So-called jumpers, which are used on some motherboards to configure voltage and various operating speeds, etc.
• A number of pins used to connect the reset button, LED for hard disk activity, built-in speaker, etc.
Fig. 29. A connector can be an array of pins like this, which suits a special cable.

Take a look at Fig. 30 and Fig. 31, which show connectors and jumpers from two different motherboards.

Fig. 30. The tiny connectors and jumpers that are hidden on any motherboard.

The ROM BIOS chip (Award brand) in Fig. 31 contains a small collection of programs (software) which are permanently stored on the motherboard, and which are used, for example, when the PC starts up:

Fig. 31. At the bottom left, you can see the two rows of pins which connect, for example, to the little speaker inside the cabinet. On the bottom right you can see two “jumpers”.

The round thing in Fig. 31 is the motherboard battery, which maintains the clock function and any settings saved in the CMOS storage.

In a later chapter I will describe the motherboard seen through the eyes of a PC builder. But first we shall take a look at the motherboard’s architecture and the central components found on it.
Chapter 5. It all starts with the CPU

There are two very fundamental components to study on the motherboard: the CPU and the busses. The CPU does all the data processing, and the busses handle all data transfer.

Fig. 32. The CPU is mounted on the motherboard, hidden under the cooling fan and heat sink.

What is a CPU?

CPU stands for Central Processing Unit. There can be several processors in a computer, but one of them is the central one: the CPU.

The reason the CPU is called a processor is because it can work with data. And it has two important jobs:
• It can do calculations.
• It can move data.

The CPU is very fast at doing both jobs. The faster the CPU can do calculations and move data, the faster we say the PC is. What follows is a short description of how to achieve faster data processing. Read it, and see if you understand all the concepts. There are three ways to improve a PC’s performance:
• Higher clock frequencies (which means more clock ticks per second).
• Greater bus width.
• Optimising the core of the processor and other components so that the maximum amount of work is done for each clock tick.

All this can lead to better bandwidth, which is required throughout the PC. The entire development process is focused around the motherboard, and especially the CPU. But all of the electronics has to be able to keep up with the high pace, and that is what makes the motherboard so fascinating.

The CPU is physically quite small. At its core is an electronic circuit (called a die), which is no bigger than your little fingernail.

Fig. 33. The CPU circuit (the “die”) can be seen in the middle of the chip (an AthlonXP shown close to actual size).

Despite its small size, the CPU is full of transistors. The die in a Pentium 4 CPU contains 125 million transistors, all squashed together into a very tight space. It is about 1 cm x 1 cm in size:
Fig. 34. Close up of a CPU circuit (die).

The electronic circuit is encapsulated in a much bigger plastic square. This is in order to make room for all the electrical contacts which are used to connect the CPU to the motherboard.

The individual contacts are called pins, and a CPU can have 478 of them (as does the Pentium 4). The large number of pins means that the socket has to be relatively large.
Fig. 35. The underside of a (Pentium 4) CPU, showing the many pins.

Which CPU?

The companies Intel and AMD make most CPU’s. Intel laid the foundations for the development of CPU’s for PCs with their 8086 and 8088 processors, which are more than 20 years old.

CPU’s are developed in series, or generations. Each series is known by its name. The last four generations of Intel processors, for example, have been the Pentium, Pentium II, Pentium III and Pentium 4. Running alongside these is the Celeron series, which are cheaper versions, typically with reduced L2 cache and a slower front side bus:

Fig. 36. A Celeron processor supplied in a box from Intel, with heat sink and fan.

Within each generation there are many variants with different clock frequencies. For example, when the Pentium 4 was released in the year 2000, it was as a 1400 MHz version. The original model was later followed up by versions with 1800, 2000, etc. MHz, up to 2400 MHz (the clock frequencies came in intervals of 100 MHz). In the year 2002, a new model came out for which the clock frequencies started at 2266, 2400 and 2533 MHz, and increased in intervals of 133 MHz. A year later the clock frequencies were raised in intervals of 200 MHz, with the Pentium 4 chips running from 2600 to 3600 MHz. And so it continues.

The company AMD produces similar processors in the Sempron and Athlon 64 series, which also come with different clock frequencies.
Fig. 37. The Pentium 4 socket 478 on a motherboard.

Find your CPU

If you are not sure which CPU your PC uses, you can investigate this in several ways. You could check your purchase receipt. The name of the CPU should be specified there.

You could look inside your PC and locate the CPU. But it is quite difficult to get to see the model name, because there is a fan mounted on the actual chip. The fan is often glued directly onto the processor, so that it is not easy to remove it.

Fig. 38. A CPU is shown here without a cooling fan. It is mounted in a small socket which it clicks into without needing any tools.

In Windows, you can select the System Properties dialog box, where you can see the processor name and clock frequency:
You can also watch carefully when your PC starts up. Your CPU name and clock frequency are among the first things displayed on the screen. You can press the Pause key to pause the startup process.

Below you can see a picture of the startup screen for a PC. This PC has an Intel Pentium 4, with a clock frequency (work rate) of 2533 MHz:
Fig. 39. If you are not sure which CPU your PC uses, you can see it on the screen, shortly after you switch on your PC.

CPU testing programs

Finally, let me just mention some small utility programs which you can download from the Internet (e.g. search for “WCPUID” or “CPU-Z” on www.google.com, and you’ll find them). The programs WCPUID and CPU-Z reveal lots of information about your CPU, chipset, etc. They are used by motherboard nerds.
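If you have Python installed, a few lines of code can report some of the same basic facts. This is only a rough sketch using the Python standard library; it shows far less than WCPUID or CPU-Z, and the function name `cpu_summary` is made up for this example. On some systems the processor name comes back as an empty string.

```python
import os
import platform


def cpu_summary() -> dict:
    """Collect basic CPU facts that the Python standard library can report."""
    return {
        # Processor name; may be an empty string on some systems.
        "processor": platform.processor(),
        # Machine type, for example "x86_64" or "AMD64".
        "machine": platform.machine(),
        # Number of logical CPUs; None if it cannot be determined.
        "logical_cpus": os.cpu_count(),
    }


if __name__ == "__main__":
    for key, value in cpu_summary().items():
        print(f"{key}: {value}")
```

Note that the number reported is the count of logical CPUs, so a processor with HyperThreading will typically show up as more CPUs than it has physical cores.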
Fig. 40. Here CPU-Z reports that the Pentium 4 processor is a “Prescott” model. Due to HyperThreading, the processor appears to hold two cores.
Chapter 6. The CPU and the motherboard

The heart and soul of the PC’s data processing is the CPU. But the processor is not alone in the world; it communicates with the rest of the motherboard. There will be many new terms introduced in the following sections, so remember that you can find definitions for all the abbreviations in the back of the guide.

Busses do the transfers

Data packets (of 8, 16, 32, 64 or more bits at a time) are constantly being moved back and forth between the CPU and all the other components (RAM, hard disk, etc.). These transfers are all done using busses.

The motherboard is designed around some very powerful data channels (or pathways, as they are also called). It is these busses which connect all the components to each other.

Fig. 41. The busses are the data channels which connect the PC’s components together. Some are designed for small transfers, others for large ones.

Busses with varying capacities

There is not just one bus on a motherboard; there are several. But they are all connected, so that data can run from one to another, and hence reach the farthest corners of the motherboard.

We can say that a bus system is subdivided into several branches. Some of the PC components work with enormous amounts of data, while others manage with much less. For example, the keyboard only sends very few bytes per second, whereas the working storage (RAM) can send and receive several gigabytes per second. So you can’t attach RAM and the keyboard to the same bus.

Two busses with different capacities (bandwidths) can be connected if we place a controller between them. Such a controller is often called a bridge, since it functions as a bridge between the two different traffic systems.

Fig. 42. Bridges connect the various busses together.

The entire bus system starts close to the CPU, where the load (traffic) is greatest. From here, the busses work outwards towards the other components. Closest to the CPU we find the working storage. RAM is the component which has the very greatest data traffic, and is therefore connected directly to the CPU by a particularly powerful bus. It is called the front side bus (FSB) or (in older systems) the system bus.

Fig. 43. The PC’s most important bus looks after the “heavy” traffic between the CPU and RAM.

The busses connecting the motherboard to the PC’s peripheral devices are called I/O busses. They are managed by the controllers.

The chip set

The motherboard’s busses are regulated by a number of controllers. These are small circuits which have been designed to look after a particular job, like moving data to and from EIDE devices (hard disks, etc.).

A number of controllers are needed on a motherboard, as there are many different types of hardware devices which all need to be able to communicate with each other. Most of these controller functions are grouped together into a couple of large chips, which together comprise the chip set.
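The different bus capacities discussed in this chapter can be estimated with simple arithmetic: the peak bandwidth is the bus width multiplied by the clock frequency (and by the number of transfers per clock tick, if the bus does more than one). The sketch below is a rough illustration; the function name is made up, and the two example busses use round illustrative numbers rather than figures from this guide.

```python
def peak_bandwidth_bytes_per_s(width_bits: int, clock_hz: float,
                               transfers_per_clock: int = 1) -> float:
    """Theoretical peak bandwidth of a bus in bytes per second.

    width_bits / 8 gives the bytes moved per transfer; multiplying by the
    clock frequency and the transfers per clock gives bytes per second.
    """
    return width_bits / 8 * clock_hz * transfers_per_clock


if __name__ == "__main__":
    # Illustrative values only: a wide, fast bus and a narrow, slow one.
    fast = peak_bandwidth_bytes_per_s(64, 100e6)  # 64 bits at 100 MHz
    slow = peak_bandwidth_bytes_per_s(32, 33e6)   # 32 bits at 33 MHz
    print(f"Fast bus: {fast / 1e6:.0f} MB/s")
    print(f"Slow bus: {slow / 1e6:.0f} MB/s")
```

These are theoretical maximum figures. Real transfers are slower because of protocol overhead and waiting time, but the calculation shows why a wide, fast bus and a narrow, slow bus need a bridge between them.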
Fig. 44. The two chips which make up the chipset, and which connect the motherboard’s busses.

The most widespread chipset architecture consists of two chips, usually called the north and south bridges. This division applies to the most popular chipsets from VIA and Intel. The north bridge and south bridge are connected by a powerful bus, which is sometimes called a link channel:

Fig. 45. The north bridge and south bridge share the work of managing the data traffic on the motherboard.

The north bridge

The north bridge is a controller which controls the flow of data between the CPU and RAM, and to the AGP port.

In Fig. 46 you can see the north bridge, which has a large heat sink attached to it. It gets hot because of the often very large amounts of data traffic which pass through it. All around the north bridge you can see the devices it connects:

Fig. 46. The north bridge and its immediate surroundings. A lot of traffic runs through the north bridge, hence the heat sink.

The AGP is actually an I/O port. It is used for the video card. In contrast to the other I/O devices, the AGP port is connected directly to the north bridge, because it has to be as close to the RAM as possible. The same goes for the PCI Express x16 port, which is the replacement for AGP in new motherboards. But more on that later.
Chapter 7. The south bridge

The south bridge incorporates a number of different controller functions. It looks after the transfer of data to and from the hard disk and all the other I/O devices, and passes this data into the link channel which connects to the north bridge.

In Fig. 44 you can clearly see that the south bridge is physically located close to the PCI slots, which are used for I/O devices.

Fig. 47. The chipset’s south bridge combines a number of controller functions into a single chip.

The various chipset manufacturers

Originally it was basically only Intel who supplied the chipsets to be used in motherboards. This was quite natural, since Intel knows everything about their own CPU’s and can therefore produce chipsets which match them. But at the time the Pentium II and III came out, other companies began to get involved in this market. The Taiwanese company VIA today produces chipsets for both AMD and Intel processors, and these are used in a large number of motherboards.

Other companies (like SiS, nVidia, ATI and ALi) also produce chipsets, but these haven’t (yet?) achieved widespread use. The CPU manufacturer AMD produces some chipsets for their own CPU’s, but they also work together closely with VIA as the main supplier for Athlon motherboards.

Fig. 48. The Taiwanese company VIA has been a leader in the development of new chipsets in recent years.

Since all data transfers are managed by the chipset’s two bridges, the chipset is the most important individual component on the motherboard, and new chipsets are constantly being developed.

The chipset determines the limits for clock frequencies, bus widths, etc. The chipset’s built-in controllers are also responsible for connecting I/O devices like hard disks and USB ports. Thus the chipset also determines, in practice, which types of devices can be connected to the PC.

Fig. 49. The two chips which make up a typical chipset. Here we have VIA’s model P4X266A, which was used in early motherboards for Pentium 4 processors.

Sound, network, and graphics in chipsets

Developments in recent years have led chipset manufacturers to attempt to place more and more functions in the chipset.

These extra functions are typically:
• Video card (integrated into the north bridge)
• Sound card (in the south bridge)
• Modem (in the south bridge)
• Network and Firewire (in the south bridge)

All these functions have traditionally been managed by separate devices, usually plug-in cards, which connect to the PC. But it has been found that these functions can definitely be incorporated into the chipset.

Fig. 50. Motherboard with built-in sound functionality.

Intel has, for many years, managed to produce excellent network cards (Ethernet 10/100 Mbps), so it is only natural that they should integrate this functionality into their chipsets.
Sound facilities in a chipset cannot be compared with “real” sound cards (like, for example, Sound Blaster Audigy). But the sound functions work satisfactorily if you only want to connect a couple of small speakers to the PC, and don’t expect perfect quality.

Fig. 51. This PC has two sound cards installed, as shown in this Windows XP dialog box. The VIA AC’97 is a sound card emulation which is built into the chipset.

Many chipsets also come with a built-in video card. The advantage is clear: you can save having a separate video card, which can cost $100 or more.

Again, the quality can’t be compared with what you get with a separate, high quality video card. But if you don’t particularly need support for multiple screens, DVI (for flat screens), super 3D performance for games, or TV-out, the integrated graphics controller can certainly do the job.

Fig. 52. This PC uses a video card which is built into the Intel i810 chipset.

It is important that the integrated sound and graphics functions can be disabled, so that you can replace them with a real sound or video card. The sound functions won’t cause any problems; you can always ask Windows to use a particular sound card instead of another one.

But the first Intel chipset with integrated graphics (the i810) did not allow for an extra video card to be installed. That wasn’t very smart, because it meant users were locked into using the built-in video card. In the subsequent chipset (i815), the problem was resolved.

Buying a motherboard

If you want to build a PC yourself, you have to start by choosing a motherboard. It is the foundation for the entire PC.

Most of the motherboards on the market are produced in Taiwan, where manufacturers like Microstar, Asus, Epox, Soltek and many others supply a wide range of different models. Note that a producer like Microstar supplies motherboards to brand name manufacturers like Fujitsu-Siemens, so you can comfortably trust in the quality.
Taiwan is the leader in the area of motherboards.

The first issue to work out is which CPU you want to use. For example, if you want to use a Pentium 4 from Intel, there is one line of motherboards you can choose between. If you choose an AthlonXP, there is another line. And the difference lies in which chipset is being used in the motherboard.

Fig. 53. A typical (technical) advertisement for a motherboard.

Once you have decided on a processor, you should try to get a motherboard with the latest chipset available, because new versions of chipsets continue to be released, with greater functionality. At the time of writing, for example, chipsets often include these functions:
• USB version 2.0.
• Dual channel RAM.
• Support for 400 and 533 MHz DDR2 RAM.
• Integrated Firewire ports.
• Serial ATA.
• Surround sound.
• Gigabit Ethernet.

You will most likely want to have these facilities (which are described later in the guide) on your PC. That is why it is important to choose the right motherboard with the latest generation chipset.

Extra facilities

In addition, it can sometimes be worth choosing a motherboard which has extra facilities. For example, all manufacturers have “luxury models” with built-in RAID controllers, making it possible to install several hard disks. There are motherboards around with loads of extra facilities, such as:
• Built-in RAID or (seldom) SCSI controller.
• Other network, screen and sound facilities.
• Wireless LAN.
• SmartCard/MemoryStick/etc. readers.

One of the advantages of building your own PC is that you can choose a really exciting motherboard. Development is taking place rapidly, and by choosing the right motherboard, you can design the absolute latest PC on the market.

You can also find hundreds of articles on the Internet about each motherboard and chipset. So I can comfortably recommend you build your own PC, as long as you do your homework first! Make sure you read the rest of the guide before you start choosing a new motherboard!
Chapter 8. Inside and around the CPU

In this and the following chapters, I will focus on a detailed look at the CPU. One of the goals is to help you understand why manufacturers keep releasing new and more powerful processors. In order to explain that, we will have to go through what will at times be a quite detailed analysis of the CPU’s inner workings.

Some of the chapters will probably be fairly hard to understand; I have spent a lot of time myself on my “research”, but I hope that what I present in these chapters will shed some light on these topics.

Naturally, I will spend most of my time on the latest processors (the Athlon XP and Pentium 4). But we need to examine their internal architectures in light of the older CPU architectures, if we want to understand them properly. For this reason I will continually make comparisons across the various generations of CPU’s.

I will now take you on a trip inside the CPU. We will start by looking at how companies like Intel and AMD can continue to develop faster processors.

Two ways to greater speed

Of course faster CPU’s are developed as a result of hard work and lots of research. But there are two quite different directions in this work:
• More power and speed in the CPU, for example, from higher clock frequencies.
• Better exploitation of existing processor power.

Both approaches are used. It is a well-known fact that bottlenecks of various types drain the CPU of up to 75 % of its power. So if these can be removed or reduced, the PC can become significantly faster without having to raise the clock frequency dramatically.

It’s just that it is very complicated to remove, for example, the bottleneck surrounding the front side bus, which I will show you later. So the manufacturers are forced to continue to raise the working rate (clock frequency), and hence to develop new process technology, so that CPU’s with more power can come onto the market.

Clock frequencies

If we look at a CPU, the first thing we notice is the clock frequency. All CPU’s have a working speed, which is regulated by a tiny crystal.

The crystal is constantly vibrating at a very large number of “beats” per second. For each clock tick, an impulse is sent to the CPU, and each pulse can, in principle, cause the CPU to perform one (or more) actions.
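The time between two clock ticks is simply one divided by the clock frequency. As a small sketch (the function name is made up for this example, and the three frequencies are just sample values), the calculation looks like this in Python:

```python
def clock_period_seconds(frequency_hz: float) -> float:
    """Return the time between two clock ticks: period = 1 / frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


if __name__ == "__main__":
    samples = [("133 MHz", 133e6), ("1200 MHz", 1200e6), ("2 GHz", 2e9)]
    for label, hz in samples:
        # Show the period in nanoseconds (billionths of a second).
        print(f"{label}: {clock_period_seconds(hz) * 1e9:.3f} ns")
```

Doubling the clock frequency halves the time available for each tick, which is why the rest of the electronics has to keep up when the clock speed rises.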
Fig. 54. The CPU’s working speed is regulated by a crystal which “oscillates” millions of times each second.

The number of clock ticks per second is measured in Hertz. Since the CPU’s crystal vibrates millions of times each second, the clock speed is measured in millions of oscillations (megahertz or MHz). Modern CPU’s actually have clock speeds running into billions of ticks per second, so we have started having to use gigahertz (GHz).

These are unbelievable speeds. See for yourself how short the period of time is between individual clock ticks at these frequencies. We are talking about billionths of a second:

Clock frequency    Time period per clock tick
133 MHz            0.000 000 007 519 seconds
1200 MHz           0.000 000 000 833 seconds
2 GHz              0.000 000 000 500 seconds

Fig. 55. The CPU works at an incredible speed.

The trend is towards ever increasing clock frequencies. Let’s take a closer look at how this is possible.

More transistors

New types of processors are constantly being developed, for which the clock frequency keeps getting pushed up a notch. The original PC from 1981 worked at a modest 4.77 MHz, whereas the clock frequency 20 years later was up to 2 GHz.

In Fig. 56 you can see an overview of the last 20 years of development in this area. The table shows the seven generations of Intel processors which have brought about the PC revolution. The latest version of Pentium 4 is known under the code name Prescott.

Gen.  CPU           Year (intr.)  Clock frequency   No. of transistors
1     8088          1979          4.77-8 MHz        29,000
2     80286         1982          6-12.5 MHz        134,000
3     80386         1985          16-33 MHz         275,000
4     80486         1989          25-100 MHz        1,200,000
5     Pentium       1993          60-200 MHz        3,100,000
      Pentium MMX   1997          166-300 MHz       4,500,000
6     Pentium Pro   1995          150-200 MHz       5,500,000
      Pentium II    1997          233-450 MHz       7,500,000
      Pentium III   1999          450-1200 MHz      28,000,000
7     Pentium 4     2000          1400-2200 MHz     42,000,000
                    2002          2200-2800 MHz     55,000,000
                    2003          2600-3200 MHz     55,000,000
      “Prescott”    2004          2800-3600 MHz     125,000,000

Fig. 56. Seven generations of CPU’s from Intel. The number of transistors in the Pentium III and 4 includes the L2 cache.

Each processor has been on the market for several years, during which time the clock frequency has increased. Some of the processors were later released in improved versions with higher clock frequencies. I haven’t included the Celeron processor in the overview. Celerons are special discount versions of the Pentium II, III, and 4 processors.

Anyone can see that there has been an unbelievable development. Modern CPU’s are one thousand times more powerful than the very first ones.

In order for the industry to be able to develop faster CPU’s each year, new manufacturing methods are required. More and more transistors have to be squeezed into smaller and smaller chips.

Fig. 57. A photograph from one of Intel’s factories, in which a technician displays the Pentium 4 processor core. It is a tiny piece of silicon which contains 42 million transistors.
Chapter 9. Moore’s Law

This development was actually described many years ago, in what we call Moore’s Law.

Right back in 1965, Gordon Moore predicted (in the Electronics journal) that the number of transistors in processors (and hence their speed) would be able to be doubled every 18 months.

Moore expected that this regularity would at least apply up until 1975. But he was too cautious; we can see that the development continues to follow Moore’s Law today, as is shown in Fig. 59.

Fig. 58. In 1968, Gordon Moore helped found Intel.

If we try to look ahead in time, we can work out that in 2010 we should have processors containing 3 billion transistors. And with what clock frequencies? You’ll have to guess that for yourself.

Fig. 59. Moore’s Law (from Intel’s website).

Process technology

The many millions of transistors inside the CPU are made of, and connected by, ultra thin electronic tracks. By making these electronic tracks even narrower, even more transistors can be squeezed into a small slice of silicon.

The width of these electronic tracks is measured in microns (or micrometers), which are millionths of a metre.

For each new CPU generation, the track width is reduced, based on new technologies which the chip manufacturers keep developing. At the time of writing, CPU’s are being produced with a track width of 0.13 microns, and this will be reduced to 0.09 and 0.06 microns in the next generations.
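The doubling rule in Moore’s Law at the start of this chapter is easy to play with in code. The sketch below does two things: it works out the average doubling time implied by two entries in Fig. 56 (the 8088 from 1979 and the Pentium 4 “Prescott” from 2004), and it extrapolates a transistor count forward with a fixed doubling period. The function names are made up for this example, and the results are rough estimates, not predictions.

```python
import math


def doubling_time_years(start_count: float, end_count: float,
                        years: float) -> float:
    """Average time per doubling implied by growth over a number of years."""
    doublings = math.log2(end_count / start_count)
    return years / doublings


def project_transistors(start_count: float, start_year: int,
                        target_year: int,
                        doubling_years: float = 1.5) -> float:
    """Extrapolate a transistor count, assuming a fixed doubling period."""
    elapsed = target_year - start_year
    return start_count * 2 ** (elapsed / doubling_years)


if __name__ == "__main__":
    # Transistor counts taken from Fig. 56: 8088 (1979) and Prescott (2004).
    observed = doubling_time_years(29_000, 125_000_000, 2004 - 1979)
    print(f"Observed doubling time: {observed:.2f} years")

    # Extrapolate from Prescott to 2010 with an 18 month doubling period.
    estimate = project_transistors(125_000_000, 2004, 2010)
    print(f"Estimate for 2010: {estimate / 1e9:.1f} billion transistors")
```

With these particular inputs, the observed doubling time comes out at roughly two years rather than 18 months, and the extrapolation lands in the low billions, the same order of magnitude as the 3 billion mentioned above. The exact result depends heavily on which starting point and doubling period you choose.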
Fig. 60. CPU’s are produced in extremely high-technology environments (“clean rooms”). Photo courtesy of AMD.

In earlier generations aluminium was used for the current carrying tracks in the chips. With the change to 0.18 and 0.13-micron technology, aluminium began to be replaced with copper. Copper is cheaper, and it carries current better than aluminium. It had previously been impossible to insulate the copper tracks from the surrounding silicon, but IBM solved this problem in the late 1990’s.

AMD became the first manufacturer to mass-produce CPU’s with copper tracks in their chip factory Fab 30 in Dresden, Germany. A new generation of chips requires new chip factories (fabs) to produce it, and these cost billions of dollars to build. That’s why they like a few years to pass between each successive generation. The old factories have to have time to pay for themselves before new ones start to be used.

Fig. 61. AMD’s Fab 30 in Dresden, which was the first factory to mass-produce copper-based CPU’s.

A grand new world …

We can expect a number of new CPU’s in this decade, all produced in the same way as they are now, just with smaller track widths. But there is no doubt that we are nearing the physical limits for how small the transistors produced using the existing technology can be. So intense research is underway to find new materials, and it appears that nanotransistors, produced using organic (carbon-based) semiconductors, could take over the baton from the existing process technology.

Bell Labs in the USA has produced nanotransistors with widths of just one molecule. It is claimed that this process can be used to produce both CPU’s and RAM circuits up to 1000 times smaller than what we have today!
Less power consumption

The types of CPU’s we have today use a fairly large amount of electricity when the PC is turned on and is processing data. The processor, as you know, is installed in the motherboard, from which it receives power. There are actually two different voltage levels, which are both supplied by the motherboard:
• One voltage level which powers the CPU core (kernel voltage).
• Another voltage level which powers the CPU’s I/O ports, which is typically 3.3 volts.

As the track width is reduced, more transistors can be placed within the same area, and hence the voltage can be reduced.

As a consequence of the narrower process technology, the kernel voltage has been reduced from 3 volts to about 1 volt in recent years. This leads to lower power consumption per transistor. But since the number of transistors increases by a corresponding amount in each new CPU generation, the end result is often that the total power consumption is unchanged.

Fig. 62. A powerful fan. Modern CPU’s require something like this.

It is very important to cool the processor; a CPU can easily burn 50-120 Watts. This produces a fair amount of heat in a very small area, so without the right cooling fan and motherboard design, a Gigahertz processor could quickly burn out.

Modern processors contain a thermal diode which can raise the alarm if the CPU gets too hot. If the motherboard and BIOS are designed to pay attention to the diode’s signal, the processor can be shut down temporarily so that it can cool down.
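The trade-off between lower voltage and more transistors described in this section can be illustrated with a common rule of thumb for chip power, which is not taken from this guide: switching power is roughly proportional to the number of transistors, the square of the voltage, and the clock frequency. The sketch below only compares ratios between an old and a new design, and the function name is made up for this example.

```python
def relative_dynamic_power(voltage_ratio: float,
                           transistor_ratio: float = 1.0,
                           frequency_ratio: float = 1.0) -> float:
    """Power of a new design relative to an old one (first-order estimate).

    Assumes power ~ transistors * voltage**2 * frequency, a simplified
    rule of thumb. Each argument is the new value divided by the old one.
    """
    return transistor_ratio * voltage_ratio ** 2 * frequency_ratio


if __name__ == "__main__":
    # Core voltage reduced from 3 volts to 1 volt, everything else unchanged.
    per_transistor = relative_dynamic_power(1 / 3)
    print(f"Relative power per transistor: {per_transistor:.2f}")

    # The same voltage drop combined with nine times as many transistors.
    total = relative_dynamic_power(1 / 3, transistor_ratio=9)
    print(f"Relative total power: {total:.2f}")
```

Under this simplified assumption, the savings from the lower voltage are eaten up once the transistor count grows by the same factor, which matches the observation that total power consumption often stays about the same. Rising clock frequencies push the total up further.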
Fig. 63. The temperatures on the motherboard are constantly reported to this program.

Cooling is a whole science in itself. Many “nerds” try to push CPU’s to work at higher clock speeds than they are designed for. This is often possible, but it requires very good cooling – and hence often huge cooling units.

30 years development

Higher processor speeds require more transistors and narrower electronic tracks in the silicon chip. In the overview in Fig. 64 you can see the course of developments in this area.

Note that the 4004 processor was never used for PC’s. The 4004 was Intel’s first commercial product in 1971, and it laid the foundation for all their later CPU’s. It was a 4-bit processor which worked at 108 kHz (0.1 MHz), and contained 2,250 transistors. It was used in the first pocket calculators, which I can personally remember from around 1973-74 when I was at high school. No-one could have predicted that the device which replaced the slide rule could develop, in just 30 years, into a Pentium 4 based super PC.

If, for example, the development in automobile technology had been just as fast, we would today be able to drive from Copenhagen to Paris in just 2.8 seconds!

Year        Intel CPU                Technology (track width)
1971        4004                     10 microns
1979        8088                     3 microns
1982        80286                    1.5 microns
1985        80386                    1 micron
1989        80486                    1.0/0.8 microns
1993        Pentium                  0.8/0.5/0.35 microns
1997        Pentium II               0.28/0.25 microns
1999        Pentium III              0.25/0.18/0.13 microns
2000-2003   Pentium 4                0.18/0.13 microns
2004-2005   Pentium 4 “Prescott”     0.09 microns

Fig. 64. The high clock frequencies are the result of new process technology with smaller electronic “tracks”.

A conductor which is 0.09 microns (or 90 nanometres) thick is 1150 times thinner than a normal human hair. These are tiny things we are talking about here.

Wafers and die size

Another CPU measurement is its die size. This is the size of the actual silicon sheet containing all the transistors (the tiny area in the middle of Fig. 33 on page 15).

At the chip factories, the CPU cores are produced in so-called wafers. These are round silicon sheets which typically contain 150-200 processor cores (dies).

The smaller one can make each die, the more economical production can become. A big die is also normally associated with greater power consumption and hence also requires cooling with a powerful fan (e.g. see Fig. 63 on page 25 and Fig. 124 on page 50).
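To get a feel for why die size matters so much, here is a small Python sketch that estimates how many whole dies fit on a round wafer. It uses a common rule-of-thumb approximation (wafer area divided by die area, minus a correction for the partial dies lost along the round edge). The function name and the 200 mm wafer diameter are assumptions made for illustration; the die areas are taken from Fig. 66. Real yields also depend on defects and layout, which are ignored here.

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Rough estimate of the number of whole dies on a round wafer.

    Approximation: wafer area / die area, minus a term for the partial
    dies that are lost along the circular edge of the wafer.
    """
    radius = wafer_diameter_mm / 2
    gross = math.pi * radius ** 2 / die_area_mm2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return max(0, int(gross - edge_loss))

# Die areas from Fig. 66; the 200 mm wafer diameter is an assumption.
for name, area in [("Pentium 4, 0.18 micron", 217),
                   ("Pentium 4, 0.09 micron", 112)]:
    print(name, "->", dies_per_wafer(200, area), "dies per wafer (estimate)")
```

Under these assumptions, roughly halving the die area about doubles the number of dies per wafer, which is the economic argument made above.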
Fig. 65. A technician from Intel holding a wafer. This slice of silicon contains hundreds of tiny processor cores, which end up as CPU’s in everyday PC’s.

You can see the measurements for a number of CPU’s below. Note the difference, for example, between a Pentium and a Pentium II. The latter is much smaller, and yet still contains nearly 2½ times as many transistors. Every reduction in die size is welcome, since the smaller this is, the more processors can fit on a wafer. And that makes production cheaper.

CPU             Track width (microns)   Die size   Number of transistors
Pentium         0.80                    294 mm²    3.1 mil.
Pentium MMX     0.28                    140 mm²    4.5 mil.
Pentium II      0.25                    131 mm²    7.5 mil.
Athlon          0.25                    184 mm²    22 mil.
Pentium III     0.18                    106 mm²    28 mil.
Pentium III     0.13                    80 mm²     28 mil.
Athlon XP       0.18                    128 mm²    38 mil.
Pentium 4       0.18                    217 mm²    42 mil.
Pentium 4       0.13                    145 mm²    55 mil.
Athlon XP+      0.13                    115 mm²    54 mil.
Athlon 64 FX    0.13                    193 mm²    106 mil.
Pentium 4       0.09                    112 mm²    125 mil.

Fig. 66. The smaller the area of each processor core, the more economical chip production can be.

The modern CPU generations

As mentioned earlier, the various CPU’s are divided into generations (see also Fig. 56 on page 23).

At the time of writing, we have started on the seventh generation. Below you can see the latest processors from Intel and AMD, divided into these generations. The transitions can be a bit hazy. For example, I’m not sure whether AMD’s K6 belongs to the 5th or the 6th generation. But as a whole, the picture is as follows:

Generation   CPU’s
5th          Pentium, Pentium MMX, K5, K6
6th          Pentium Pro, K6-II, Pentium II, K6-3, Athlon, Pentium III
7th          Pentium 4, Athlon XP
8th          Athlon 64 FX, Pentium 5
Fig. 67. The latest generations of CPU’s.
Copyright Michael Karbo and ELI Aps., Denmark, Europe.

Chapter 10. The cache

In the previous chapter, I described two aspects of the ongoing development of new CPU’s – increased clock frequencies and the increasing number of transistors being used. Now it is time to look at a very different yet related technology – the processor’s connection to the RAM, and the use of the L1 and L2 caches.

Speed conflict

The CPU works internally at very high clock frequencies (like 3200 MHz), and no RAM can keep up with these.

The most common RAM speeds are between 266 and 533 MHz. And these are just a fraction of the CPU’s working speed. So there is a great chasm between the machine (the CPU) which slaves away at perhaps 3200 MHz, and the “conveyor belt”, which might only work at 333 MHz, and which has to ship the data to and from the RAM. These two subsystems are simply poorly matched to each other.

If nothing could be done about this problem, there would be no reason to develop faster CPU’s. If the CPU had to wait for a bus which worked at one sixth of its speed, the CPU would be idle five sixths of the time. And that would be pure waste.

The solution is to insert small, intermediate stores of high-speed RAM. These buffers (cache RAM) provide a much more efficient transition between the fast CPU and the slow RAM. Cache RAM operates at higher clock frequencies than normal RAM. Data can therefore be read more quickly from the cache.

Data is constantly being moved

The cache delivers its data to the CPU registers. These are tiny storage units which are placed right inside the processor core, and they are the absolute fastest RAM there is. The size and number of the registers are designed very specifically for each type of CPU.
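The “five sixths idle” argument from the speed conflict section can be checked with a few lines of Python. This is only a sketch of the worst case described in the text, where the CPU needs one bus transfer for every clock tick and has no cache at all; the function name is made up for this example.

```python
def idle_fraction(cpu_mhz: float, bus_mhz: float) -> float:
    """Share of CPU clock ticks spent waiting, assuming (worst case)
    that every tick needs one transfer over the slower bus."""
    if bus_mhz >= cpu_mhz:
        return 0.0
    return 1 - bus_mhz / cpu_mhz

# A 3200 MHz CPU against the RAM speeds mentioned in the text.
for bus in (266, 333, 533):
    print(f"{bus} MHz bus: CPU idle {idle_fraction(3200, bus):.0%} of the time")
```

A 533 MHz bus is one sixth of 3200 MHz, which gives the five sixths of idle time mentioned above; the slower buses are even worse.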
Fig. 68. Cache RAM is much faster than normal RAM.

The CPU can move data in different sized packets, such as bytes (8 bits), words (16 bits), dwords (32 bits) or blocks (larger groups of bits), and this often involves the registers. The different data packets are constantly moving back and forth:

• from the CPU registers to the Level 1 cache.
• from the L1 cache to the registers.
• from one register to another.
• from L1 cache to L2 cache, and so on…

The cache stores are a central bridge between the RAM and the registers which exchange data with the processor’s execution units.

The optimal situation is if the CPU is able to constantly work and fully utilize all clock ticks. This would mean that the registers would have to always be able to fetch the data which the execution units require. But this is not the reality, as the CPU typically only utilizes 35% of its clock ticks. However, without a cache, this utilization would be even lower.

Bottlenecks

CPU caches are a remedy against a very specific set of “bottleneck” problems. There are lots of “bottlenecks” in the PC – transitions between fast and slower systems, where the fast device has to wait before it can deliver or receive its data. These bottlenecks can have a very detrimental effect on the PC’s total performance, so they must be minimised.

Fig. 69. A cache increases the CPU’s capacity to fetch the right data from RAM.

The absolute worst bottleneck exists between the CPU and RAM. It is here that we have the heaviest data traffic, and it is in this area that PC manufacturers are expending a lot of energy on new development. Every new generation of CPU brings improvements relating to the front side bus.

The CPU’s cache is “intelligent”, so that it can reduce the data traffic on the front side bus. The cache controller constantly monitors the CPU’s work, and always tries to read in precisely the data the CPU needs. When it is successful, this is called a cache hit. When the cache does not contain the desired data, this is called a cache miss.

Two levels of cache

The idea behind cache is that it should function as a “near store” of fast RAM. A store which the CPU can always be supplied from.

In practice there are always at least two close stores. They are called Level 1, Level 2, and (if applicable) Level 3 cache. Some processors (like the Intel Itanium) have three levels of cache, but these are only used for very special server applications. In standard PC’s we find processors with L1 and L2 cache.

Fig. 70. The cache system tries to ensure that relevant data is constantly being fetched from RAM, so that the CPU (ideally) never has to wait for data.

L1 cache

Level 1 cache is built into the actual processor core. It is a piece of RAM, typically 8, 16, 20, 32, 64 or 128 Kbytes, which operates at the same clock frequency as the rest of the CPU. Thus you could say the L1 cache is part of the processor.

L1 cache is normally divided into two sections, one for data and one for instructions. For example, an Athlon processor may have a 32 KB data cache and a 32 KB instruction cache. If the cache is common for both data and instructions, it is called a unified cache.
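The idea of cache hits and cache misses described in this chapter can be illustrated with a tiny simulation. The sketch below models a very simple “direct-mapped” cache in Python, where each block of memory can only be stored in one particular slot. This is a textbook simplification chosen for illustration, not the design of any particular CPU, and the sizes (64 lines of 32 bytes) and names are assumptions.

```python
def simulate_cache(addresses, num_lines=64, line_size=32):
    """Count hits and misses for a tiny direct-mapped cache.

    Each cache line holds `line_size` consecutive bytes. An address maps
    to exactly one slot; if that slot already holds the right block it is
    a hit, otherwise the block is loaded (a miss) and replaces the old one.
    """
    slots = [None] * num_lines          # which memory block each slot holds
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size       # memory block containing this byte
        slot = block % num_lines        # the one slot this block may use
        if slots[slot] == block:
            hits += 1
        else:
            misses += 1
            slots[slot] = block
    return hits, misses

# Reading 4096 consecutive bytes: only the first byte of each line misses.
print(simulate_cache(range(4096)))

# Jumping 2048 bytes at a time: every access lands in the same slot and misses.
print(simulate_cache([i * 2048 for i in range(128)]))
```

The two access patterns show why the cache works so well for programs that read neighbouring data, and why it helps very little when the accesses jump around in memory.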
Chapter 11. The L2 cache

The level 2 cache is normally much bigger (and unified), such as 256, 512 or 1024 KB. The purpose of the L2 cache is to constantly read in slightly larger quantities of data from RAM, so that these are available to the L1 cache.

In the earlier processor generations, the L2 cache was placed outside the chip: either on the motherboard (as in the original Pentium processors), or on a special module together with the CPU (as in the first Pentium II’s).

Fig. 71. An old Pentium II module. The CPU is mounted on a rectangular printed circuit board, together with the L2 cache, which is two chips here. The whole module is installed in a socket on the motherboard. But this design is no longer used.

As process technology has developed, it has become possible to make room for the L2 cache inside the actual processor chip. Thus the L2 cache has been integrated and that makes it function much better in relation to the L1 cache and the processor core.

The L2 cache is not as fast as the L1 cache, but it is still much faster than normal RAM.

CPU                                   L2 cache
Pentium, K5, K6                       External, on the motherboard
Pentium Pro                           Internal, in the CPU
Pentium II, Athlon                    External, in a module close to the CPU
Celeron (1st generation)              None
Celeron (later gen.), Pentium III,    Internal, in the CPU
Athlon XP, Duron, Pentium 4

Fig. 72. It has only been during the last few CPU generations that the level 2 cache has found its place, integrated into the actual CPU.

Traditionally the L2 cache is connected to the front side bus. Through it, it connects to the chipset’s north bridge and RAM:

Fig. 73. The way the processor uses the L1 and L2 cache has crucial significance for its utilisation of the high clock frequencies.

The level 2 cache takes up a lot of the chip’s die, as millions of transistors are needed to make a large cache. The integrated cache is made using SRAM (static RAM), as opposed to normal RAM which is dynamic (DRAM).

While DRAM can be made using one transistor per bit (plus a capacitor), it costs 6 transistors (or more) to make one bit of SRAM. Thus 256 KB of L2 cache would require more than 12 million transistors. That is why it has only been since fine process technology (such as 0.13 and 0.09 microns) was developed that it became feasible to integrate a large L2 cache into the actual CPU. In Fig. 66 on page 27, the number of transistors includes the CPU’s integrated cache.

Powerful bus

The bus between the L1 and L2 cache is presumably THE place in the processor architecture which has the greatest need for high bandwidth. We can calculate the theoretical maximum bandwidth by multiplying the bus width by the clock frequency. Here are some examples:

CPU                  Bus width   Clock frequency   Theoretical bandwidth
Intel Pentium III    64 bits     1400 MHz          11.2 GB/s
AMD Athlon XP+       64 bits     2167 MHz          17.3 GB/s
AMD Athlon 64        64 bits     2200 MHz          17.6 GB/s
AMD Athlon 64 FX     128 bits    2200 MHz          35.2 GB/s
Intel Pentium 4      256 bits    3200 MHz          102 GB/s

Fig. 74. Theoretical calculations of the bandwidth between the L1 and L2 cache.

Different systems

There are a number of different ways of using caches. Both Intel and AMD have saved on L2 cache in some series, in order to make cheaper products. But there is no doubt that the better the cache – both L1 and L2 – the more efficient the CPU will be and the higher its performance.

AMD have settled on a fairly large L1 cache of 128 KB, while Intel continue to use relatively small (but efficient) L1 caches.

On the other hand, Intel uses a 256-bit wide bus on the “inside edge” of the L2 cache in the Pentium 4, while AMD only has a 64-bit bus (see Fig. 74).

Fig. 75. Competing CPU’s with very different designs.

AMD uses exclusive caches in all their CPU’s. That means that the same data can’t be present in both caches at the same time, and that is a clear advantage. It’s not like that at Intel.

However, the Pentium 4 has a more advanced cache design with Execution Trace Cache making up 12 KB of the 20 KB Level 1 cache. This instruction cache works with coded instructions, as described on page 35.

CPU                             L1 cache   L2 cache
Athlon XP                       128 KB     256 KB
Athlon XP+                      128 KB     512 KB
Pentium 4 (I)                   20 KB      256 KB
Pentium 4 (II, “Northwood”)     20 KB      512 KB
Athlon 64                       128 KB     512 KB
Athlon 64 FX                    128 KB     1024 KB
Pentium 4 (III, “Prescott”)     28 KB      1024 KB

Fig. 76. The most common processors and their caches.

Latency

A very important aspect of all RAM – cache included – is latency. All RAM storage has a certain latency, which means that a certain number of clock ticks (cycles) must pass between, for example, two reads. L1 cache has less latency than L2, which is why it is so efficient.

When the cache is bypassed to read directly from RAM, the latency is many times greater. In Fig. 77 the number of wasted clock ticks is shown for various CPU’s. Note that when the processor core has to fetch data from the actual RAM (when both L1 and L2 have failed), it costs around 150 clock ticks. This situation is called stalling and needs to be avoided.

Note that the Pentium 4 has a much smaller L1 cache than the Athlon XP, but it is significantly faster. It simply takes fewer clock ticks (cycles) to fetch data:

Latency      Pentium II   Athlon     Pentium 4
L1 cache:    3 cycles     3 cycles   2 cycles
L2 cache:    18 cycles    6 cycles   5 cycles

Fig. 77. Latency leads to wasted clock ticks; the fewer there are of these, the faster the processor will appear to be.

Intelligent “data prefetch”

In CPU’s like the Pentium 4 and Athlon XP, a handful of support mechanisms are also used which work in parallel with the cache. These include:

A hardware auto data prefetch unit, which attempts to guess which data should be read into the cache. This device monitors the instructions being processed and predicts what data the next job will need.

Related to this is the Translation Look-aside Buffer, which is also a kind of cache. It contains information which constantly supports the supply of data to the L1 cache, and this buffer is also being optimised in new processor designs. Both systems contribute to improved exploitation of the limited bandwidth in the memory system.
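The latency figures in Fig. 77 can be combined into a rough average cost per memory access. The Python sketch below uses a simplified model in which every access pays the L1 latency, an L1 miss adds the L2 latency, and a miss in both caches adds the RAM latency. The latencies come from Fig. 77 and the text (about 150 clock ticks for RAM); the hit rates are illustrative assumptions, not measured values, and the function name is made up for this example.

```python
def average_access_cycles(l1_cycles, l2_cycles, ram_cycles,
                          l1_hit_rate, l2_hit_rate):
    """Average clock ticks per memory access for a two-level cache.

    Every access pays the L1 latency; L1 misses also pay the L2 latency;
    accesses that miss in both caches also pay the RAM latency.
    """
    l1_miss = 1 - l1_hit_rate
    l2_miss = 1 - l2_hit_rate
    return l1_cycles + l1_miss * (l2_cycles + l2_miss * ram_cycles)

# Pentium 4 latencies from Fig. 77; hit rates are assumptions.
with_cache = average_access_cycles(2, 5, 150,
                                   l1_hit_rate=0.95, l2_hit_rate=0.90)
print(f"average with caches: {with_cache:.2f} cycles (RAM alone: 150 cycles)")
```

Even with these modest hit rates the average cost stays close to the L1 latency, which shows why the rare trips all the way to RAM are the ones to avoid.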
Fig. 78. The WCPUID program reports on cache in an Athlon processor.

Conclusion

L1 and L2 cache are important components in modern processor design. The cache is crucial for the utilisation of the high clock frequencies which modern process technology allows. Modern L1 caches are extremely effective. In about 96-98% of cases, the processor can find the data and instructions it needs in the cache. In the future, we can expect to keep seeing CPU’s with larger L2 caches and more advanced memory management, as this is the way forward if we want to achieve more effective utilisation of the CPU’s clock ticks. Here is a concrete example:

In January 2002 Intel released a new version of their top processor, the Pentium 4 (with the codename “Northwood”). The clock frequency had been increased by 10%, so one might expect a 10% improvement in performance. But because the integrated L2 cache was also doubled from 256 to 512 KB, the gain was found to be all of 30%.

CPU                               L2 cache   Clock freq.   Improvement
Intel Pentium 4 (0.18 micron)     256 KB     2000 MHz
Intel Pentium 4 (0.13 micron)     512 KB     2200 MHz      +30%

Fig. 79. Because of the larger L2 cache, performance increased significantly.

In 2003 AMD updated the Athlon processor with the new “Barton” core. Here the L2 cache was also doubled from 256 to 512 KB in some models. In 2004 Intel came with the “Prescott” core with 1024 KB L2 cache, which is the same size as in AMD’s Athlon 64 processors. Some Extreme Editions of the Pentium 4 even use 2 MB of L2 cache.

Xeon for servers

Intel produces special server models of their Pentium III and Pentium 4 processors. These are called Xeon, and are characterised by very large L2 caches. In an Intel Xeon the 2 MB L2 cache uses 149,000,000 transistors.

Xeon processors are incredibly expensive (about Euro 4,000 for the top models), so they have never achieved widespread distribution. They are used in high-end servers, in which the CPU only accounts for a small part of the total price.

Then there is Intel’s 64-bit server CPU, the Itanium. The processor is supplied in modules which include a 4 MB L3 cache made up of 300 million transistors.

Multiprocessors

Several Xeon processors can be installed on the same motherboard, using special chipsets. By connecting 2, 4 or even 8 processors together, you can build a very powerful computer.

These MP (Multiprocessor) machines are typically used as servers, but can also be used as powerful workstations, for example, to perform demanding 3D graphics and animation tasks. AMD has the Opteron processors, which are server versions of the Athlon 64. Not all software can make use of the PC’s extra processors; the programs have to be designed to do so. For example, there are professional versions of Windows NT, 2000 and XP, which support the use of several processors in one PC.

See also the discussion of Hyper Threading, which allows a Pentium 4 processor to appear as an MP system. Both Intel and AMD are also working on dual-core processors.
Chapter 12. Data and instructions

Now it’s time to look more closely at the work of the CPU. After all, what does it actually do?

Instructions and data

Our CPU processes instructions and data. It receives orders from the software. The CPU is fed a gentle stream of binary data via the RAM.

These instructions can also be called program code. They include the commands which you constantly – via user programs – send to your PC using your keyboard and mouse. Commands to print, save, open, etc.

Data is typically user data. Think about that email you are writing. The actual contents (the text, the letters) are user data. But when you and your software say “send”, you are sending program code (instructions) to the processor:

Fig. 80. The instructions process the user data.

Instructions and compatibility

Instructions are binary code which the CPU can understand. Binary code (machine code) is the mechanism by which PC programs communicate with the processor.

All processors, whether they are in PC’s or other types of computers, work with a particular instruction set. These instructions are the language that the CPU understands, and thus all programs have to communicate using these instructions. Here is a simplified example of some “machine code” – instructions written in the language the processor understands:

    proc near
    mov AX,01    ; put the value 1 in register AX
    mov BX,01    ; put the value 1 in register BX
    inc AX       ; add 1 to AX, which now holds 2
    add BX,AX    ; add AX to BX, which now holds 3

You can no doubt see that it wouldn’t be much fun to have to use these kinds of instructions in order to write a program. That is why people use programming tools. Programs are written in a programming language (like Visual Basic or C++). But these program lines have to be translated into machine code; they have to be compiled before they can run on a PC. The compiled program file contains instructions which can be understood by the particular processor (or processor family) the program has been “coded” for:

Fig. 81. The program code produced has to match the CPU’s instruction set. Otherwise it cannot be run.

The processors from AMD and Intel which we have been focusing on in this guide are compatible, in that they understand the same instructions.

There can be big differences in the way two processors, such as the Pentium and Pentium 4, process the instructions internally. But externally – from the programmer’s perspective – they all basically function the same way. All the processors in the PC family (regardless of manufacturer) can execute the same instructions and hence the same programs.

And that’s precisely the advantage of the PC: Regardless of which PC you have, it can run the Windows programs you want to use.

Fig. 82. The x86 instruction set is common to all PC’s.
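To make the short listing of mov, inc and add instructions from earlier in this chapter a little more concrete, here is a small Python sketch that steps through it and keeps track of the two registers. It is only a toy model written for this illustration: it understands just these three instructions, treats the registers as 16-bit values, and says nothing about how a real processor decodes machine code.

```python
def run(program):
    """Execute a tiny subset of x86-style instructions on two registers."""
    regs = {"AX": 0, "BX": 0}
    for line in program:
        op, _, operands = line.partition(" ")
        args = [a.strip() for a in operands.split(",")]
        if op == "mov":        # mov dest, number: copy a number into dest
            regs[args[0]] = int(args[1])
        elif op == "inc":      # inc dest: add 1 to dest
            regs[args[0]] = (regs[args[0]] + 1) & 0xFFFF
        elif op == "add":      # add dest, src: add src to dest
            regs[args[0]] = (regs[args[0]] + regs[args[1]]) & 0xFFFF
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return regs

listing = ["mov AX,01", "mov BX,01", "inc AX", "add BX,AX"]
print(run(listing))   # AX ends up as 2 and BX as 3
```

A compiler does the reverse job: it takes a line such as `b = b + a` in a programming language and produces instructions of this kind for the processor.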
As the years have passed, changes have been made in the instruction set along the way. A PC with a Pentium 4 processor from 2002 can handle very different applications to those which an IBM XT with an 8088 processor from 1985 can. But on the other hand, you can expect all the programs which could run on the 8088 to still run on a Pentium 4 and on an Athlon 64. The software is backwards compatible.

The entire software industry built up around the PC is based on the common x86 instruction set, which goes back to the earliest PC’s. Extensions have been made, but the original instruction set from 1979 is still being used.

x86 and CISC

People sometimes differentiate between RISC and CISC based CPU’s. The (x86) instruction set of the original Intel 8086 processor is of the CISC type, which stands for Complex Instruction Set Computer. That means that the instructions are quite diverse and complex. The individual instructions vary in length from 8 to 120 bits. It was designed for the 8086 processor, with just 29,000 transistors. The opposite of CISC is RISC.

RISC stands for Reduced Instruction Set Computer, which is fundamentally a completely different type of instruction set to CISC. RISC instructions can all have the same length (e.g. 32 bits). They can therefore be executed much faster than CISC instructions. Modern CPU’s like the Athlon XP and Pentium 4 are based on a mixture of RISC and CISC.

Fig. 83. PC’s running Windows still work with the old fashioned CISC instructions.

In order to maintain compatibility with the older DOS/Windows programs, the later CPU’s still understand CISC instructions. They are just converted to shorter, more RISC-like sub-operations (called micro-ops) before being executed. Most CISC instructions can be converted into 2-3 micro-ops.
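The conversion of one CISC instruction into a few micro-ops can be pictured with a toy example in Python. The sketch below assumes a made-up micro-op format in which only “load” and “store” may touch memory, and it only handles two-operand arithmetic instructions such as add. The real micro-ops inside a Pentium 4 or Athlon XP are internal to the processor, so the names and temporary registers here are purely hypothetical.

```python
def to_micro_ops(instruction):
    """Split a CISC-style arithmetic instruction into RISC-like steps.

    Toy model: an operand written in [brackets] is a memory location, and
    only load/store micro-ops may touch memory. "tmp" and "tmp2" are
    made-up internal registers that hold values in between.
    """
    op, _, operands = instruction.partition(" ")
    dest, src = [a.strip() for a in operands.split(",")]
    ops = []
    if src.startswith("["):            # memory source: load it first
        ops.append(f"load tmp, {src}")
        src = "tmp"
    if dest.startswith("["):           # memory destination: read, modify, write
        ops.append(f"load tmp2, {dest}")
        ops.append(f"{op} tmp2, {src}")
        ops.append(f"store {dest}, tmp2")
    else:
        ops.append(f"{op} {dest}, {src}")
    return ops

print(to_micro_ops("add BX, AX"))        # registers only: stays one operation
print(to_micro_ops("add [total], AX"))   # memory operand: becomes three
```

The point of the design is that each resulting step is short and uniform, so the execution units can handle them in the same fast way as RISC instructions.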