Gurpasset


  1. 1. Harnessing Moore’s Law (with Selected Implications) Mark D. Hill Computer Sciences Department University of Wisconsin-Madison http://www.cs.wisc.edu/~markhill This talk is based, in part, on an essay I wrote as part of a National Academy of Sciences study panel.
  2. 2. Motivation <ul><li>What do the following intervals have in common? </li></ul><ul><ul><li>Prehistory-2003 </li></ul></ul><ul><ul><li>2004-2005 </li></ul></ul><ul><li>Answer: Equal progress in absolute computer speed </li></ul><ul><li>Furthermore, more doublings in 2006-07, 2008-09, … </li></ul><ul><li>Questions </li></ul><ul><ul><li>Why do computers get better and cheaper? </li></ul></ul><ul><ul><li>How do computer architects contribute (my bias)? </li></ul></ul><ul><ul><li>How to learn to project future trends and implications? </li></ul></ul>
  3. 3. Outline <ul><li>Computer Primer </li></ul><ul><ul><li>Software </li></ul></ul><ul><ul><li>Hardware </li></ul></ul><ul><li>Technology Primer </li></ul><ul><li>Harnessing Moore’s Law </li></ul><ul><li>Future Trends </li></ul>
  4. 4. Computer Primer: Software <ul><li>Application programmers write software: </li></ul><ul><li>int main (int argc, char *argv[]) </li></ul><ul><li>{ </li></ul><ul><li>int i; </li></ul><ul><li>int sum = 0; </li></ul><ul><li>for (i = 0; i <= 100; i++) sum = sum + i * i ; </li></ul><ul><li>printf ("The sum from 0 .. 100 is %d\n", sum); </li></ul><ul><li>} </li></ul><ul><li>[Example due to Jim Larus] </li></ul>
  5. 5. Computer Primer: Software, cont. <ul><li>System software translates for hardware: </li></ul><ul><li>.main: ... </li></ul><ul><li>loop: lw $14, 28($sp) </li></ul><ul><li>mul $15, $14, $14 <--- multiply i * i </li></ul><ul><li>lw $24, 24($sp) </li></ul><ul><li>addu $25, $24, $15 <--- add to sum </li></ul><ul><li>sw $25, 24($sp) </li></ul><ul><li>addu $8, $14, 1 </li></ul><ul><li>sw $8, 28($sp) </li></ul><ul><li>ble $8, 100, loop </li></ul><ul><li>la $4, str </li></ul><ul><li>lw $5, 24($sp) </li></ul><ul><li>jal printf </li></ul><ul><li>move $2, $0 </li></ul><ul><li>lw $31, 20($sp) </li></ul><ul><li>addu $sp, 32 </li></ul><ul><li>j $31 </li></ul>
  6. 6. Computer Primer: Software, cont. <ul><li>What the hardware really sees: </li></ul><ul><li>… </li></ul><ul><li>10001111101011100000000000011100 </li></ul><ul><li>10001111101110000000000000011000 </li></ul><ul><li>00000001110011100000000000011001 <--- multiply i * i </li></ul><ul><li>00100101110010000000000000000001 </li></ul><ul><li>00101001000000010000000001100101 </li></ul><ul><li>10101111101010000000000000011100 </li></ul><ul><li>00000000000000000111100000010010 </li></ul><ul><li>00000011000011111100100000100001 <--- add to sum </li></ul><ul><li>00010100001000001111111111110111 </li></ul><ul><li>10101111101110010000000000011000 </li></ul><ul><li>00111100000001000001000000000000 </li></ul><ul><li>10001111101001010000000000011000 </li></ul><ul><li>00001100000100000000000011101100 </li></ul><ul><li>00100100100001000000010000110000 </li></ul><ul><li>10001111101111110000000000010100 </li></ul><ul><li>00100111101111010000000000100000 </li></ul><ul><li>00000011111000000000000000001000 </li></ul><ul><li>00000000000000000001000000100001 </li></ul>
  7. 7. Computer Primer: Hardware Components <ul><li>Processor </li></ul><ul><ul><li>Rapidly executes instructions </li></ul></ul><ul><ul><li>Commonly: Processor implemented as microprocessor chip (e.g., Intel Pentium 4) </li></ul></ul><ul><ul><li>Larger computers have multiple processors </li></ul></ul><ul><li>Memory </li></ul><ul><ul><li>Stores vast quantities of instructions and data </li></ul></ul><ul><ul><li>Commonly: DRAM chips backed by magnetic disks </li></ul></ul><ul><li>Input/Output </li></ul><ul><ul><li>Connect computer to outside world </li></ul></ul><ul><ul><li>E.g., keyboards, displays, & network interfaces </li></ul></ul>
  8. 8. Apple Mac 7200 (from Hennessy & Patterson) (C) Copyright 1998 Morgan Kaufmann Publishers. Reproduced with permission from Computer Organization and Design: The Hardware/Software Interface, 2E.
  9. 9. Computer Primer: Hardware Operation <ul><li>E.g., do mul temp,i,i & go on to next instruction </li></ul><ul><li>Fetch-Execute Loop { </li></ul><ul><ul><li>S1: read “current” instruction from memory </li></ul></ul><ul><ul><li>S2: decode instruction to see what is to be done </li></ul></ul><ul><ul><li>S3: read instruction input(s) </li></ul></ul><ul><ul><li>S4: perform instruction operation </li></ul></ul><ul><ul><li>S5: write instruction output(s) </li></ul></ul><ul><ul><li>Also determine “next” instruction and make it “current” </li></ul></ul><ul><li>} Repeat </li></ul>
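The fetch-execute loop on this slide can be sketched in C as a tiny interpreter. This is a minimal illustration under stated assumptions, not a description of how any real processor is built: the five-operation instruction set, the `struct Instr` layout, and the function names `run` and `sum_of_squares_demo` are all invented for this sketch. The demo program mirrors the earlier C loop that adds up i * i for i from 0 to 100.

```c
/* Toy instruction set invented for this sketch (it is not MIPS). */
enum Op { OP_MUL, OP_ADD, OP_ADDI, OP_BLE, OP_HALT };

struct Instr {
    enum Op op;
    int dst;     /* register that receives the result */
    int src1;    /* first input register */
    int src2;    /* second input register (OP_MUL, OP_ADD) */
    int imm;     /* constant operand (OP_ADDI, OP_BLE) */
    int target;  /* index of the next instruction if an OP_BLE branch is taken */
};

/* Runs the program until OP_HALT or max_steps; returns instructions executed. */
int run(const struct Instr *program, int regs[], int max_steps)
{
    int pc = 0;      /* index of the "current" instruction */
    int steps = 0;

    while (steps < max_steps) {
        struct Instr in = program[pc];           /* S1: read current instruction */
        int next = pc + 1;
        int x, y, result = 0;

        if (in.op == OP_HALT)                    /* S2: decode what is to be done */
            break;
        x = regs[in.src1];                       /* S3: read instruction inputs */
        y = (in.op == OP_ADDI || in.op == OP_BLE) ? in.imm : regs[in.src2];

        switch (in.op) {                         /* S4: perform the operation */
        case OP_MUL:  result = x * y; break;
        case OP_ADD:
        case OP_ADDI: result = x + y; break;
        case OP_BLE:  if (x <= y) next = in.target; break;
        default:      break;
        }

        if (in.op != OP_BLE)                     /* S5: write instruction output */
            regs[in.dst] = result;
        pc = next;                               /* make "next" the "current" one */
        steps++;
    }
    return steps;
}

/* Register 0 holds i, register 1 holds sum, register 2 holds i * i. */
int sum_of_squares_demo(void)
{
    static const struct Instr program[] = {
        { OP_MUL,  2, 0, 0, 0,   0 },   /* r2 = r0 * r0            */
        { OP_ADD,  1, 1, 2, 0,   0 },   /* r1 = r1 + r2            */
        { OP_ADDI, 0, 0, 0, 1,   0 },   /* r0 = r0 + 1             */
        { OP_BLE,  0, 0, 0, 100, 0 },   /* if r0 <= 100 goto 0     */
        { OP_HALT, 0, 0, 0, 0,   0 }
    };
    int regs[3] = { 0, 0, 0 };

    run(program, regs, 10000);
    return regs[1];
}
```

Under these assumptions, `sum_of_squares_demo()` returns 338350, the sum of the squares from 0 to 100.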
  10. 10. Computer Big Picture <ul><li>Separate Software & Hardware (divide & conquer) </li></ul><ul><li>Software </li></ul><ul><ul><li>Worry about applications only (hardware can already exist) </li></ul></ul><ul><ul><li>Translate from one form to another (instructions & data interchangeable!) </li></ul></ul><ul><li>Hardware </li></ul><ul><ul><li>Expose set of instructions (most functionally equivalent) </li></ul></ul><ul><ul><li>Execute instructions rapidly (without regard for software) </li></ul></ul>
  11. 11. Outline <ul><li>Computer Primer </li></ul><ul><li>Technology Primer </li></ul><ul><ul><li>Exponential Growth </li></ul></ul><ul><ul><li>Technology Background </li></ul></ul><ul><ul><li>Moore’s Law </li></ul></ul><ul><li>Harnessing Moore’s Law </li></ul><ul><li>Future Trends </li></ul>
  12. 12. Exponential Growth <ul><li>Occurs when growth is proportional to current size </li></ul><ul><li>Mathematically: dy / dt = k * y </li></ul><ul><li>Solution: y = e^(k*t) </li></ul><ul><li>E.g., a bond with $100 principal yielding 10% interest </li></ul><ul><li>1 year: $110 = $100 * (1 + 0.10) </li></ul><ul><li>2 years: $121 = $100 * (1 + 0.10) * (1 + 0.10) </li></ul><ul><li>… </li></ul><ul><li>8 years: $214 = $100 * (1 + 0.10)^8 </li></ul><ul><li>Other examples </li></ul><ul><ul><li>Unconstrained population growth </li></ul></ul><ul><ul><li>Moore’s Law </li></ul></ul>
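The bond arithmetic on this slide can be checked with a few lines of C. This is a minimal sketch, assuming interest is compounded once per year; the function name `compound` is made up for the illustration.

```c
/* Value of `principal` after `years` of growth at `rate` per year
   (rate = 0.10 means 10%), compounded once per year. */
double compound(double principal, double rate, int years)
{
    double value = principal;
    int y;

    for (y = 0; y < years; y++)
        value += value * rate;   /* growth is proportional to current size */
    return value;
}
```

For example, `compound(100.0, 0.10, 8)` comes to roughly 214.36, which is the slide's $214 after 8 years.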
  13. 13. Absurd Exponential Example <ul><li>Parameters </li></ul><ul><ul><li>$16 base </li></ul></ul><ul><ul><li>59% growth/year </li></ul></ul><ul><ul><li>36 years </li></ul></ul><ul><li>1st year’s $16 → buy book </li></ul><ul><li>3rd year’s $64 → buy computer game </li></ul><ul><li>15th year’s $16,000 → buy car </li></ul><ul><li>24th year’s $1,000,000 → buy house </li></ul><ul><li>36th year’s $300,000,000 → buy a lot </li></ul>
  14. 14. Technology Background <ul><li>Computer logic implemented with switches </li></ul><ul><ul><li>Like light switches, except that a switch can control others </li></ul></ul><ul><ul><li>Yields a network (called circuit ) of switches </li></ul></ul><ul><ul><li>Want circuits to be fast, reliable, & cheap </li></ul></ul><ul><li>Logic Technologies </li></ul><ul><ul><li>Mechanical switch & vacuum tube </li></ul></ul><ul><ul><li>Transistor (1947) </li></ul></ul><ul><ul><li>Integrated circuit ( chip ): circuit of many transistors made at once (1958) </li></ul></ul><ul><li>(Also memory & communication technologies) </li></ul>
  15. 15. (Technologist’s) Moore’s Law <ul><li>Parameters </li></ul><ul><ul><li>16 transistors/chip circa 1964 </li></ul></ul><ul><ul><li>59% growth/year </li></ul></ul><ul><ul><li>36 years (2000) and counting </li></ul></ul><ul><li>1st year’s 16 → ??? </li></ul><ul><li>3rd year’s 64 → ??? </li></ul><ul><li>15th year’s 16,000 → ??? </li></ul><ul><li>24th year’s 1,000,000 → ??? </li></ul><ul><li>36th year’s 300,000,000 → ??? </li></ul><ul><li>Was useful & then got more than 1,000,000 times better! </li></ul>
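To get a feel for how quickly 59% per year adds up, one can count the years needed to reach a given improvement factor. The sketch below is only an illustration; `years_to_multiply` is a hypothetical helper, and it counts whole years of once-per-year compounding.

```c
/* Whole years of compounding at `rate` per year until the quantity has
   grown by at least `factor`. Assumes rate > 0. */
int years_to_multiply(double rate, double factor)
{
    double growth = 1.0;
    int years = 0;

    while (growth < factor) {
        growth *= 1.0 + rate;
        years++;
    }
    return years;
}
```

At the slide's 59% per year, `years_to_multiply(0.59, 1000000.0)` is 30 and `years_to_multiply(0.59, 2.0)` is 2; the exact doubling time at that rate is about 18 months.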
  16. 16. (Technologist’s) Moore’s Law Data
  17. 17. Other “Moore’s Laws” <ul><li>Other technologies improving rapidly </li></ul><ul><ul><li>Magnetic disk capacity </li></ul></ul><ul><ul><li>DRAM capacity </li></ul></ul><ul><ul><li>Fiber-optic network bandwidth </li></ul></ul><ul><li>Other aspects improving slowly </li></ul><ul><ul><li>Delay to memory </li></ul></ul><ul><ul><li>Delay to disk </li></ul></ul><ul><ul><li>Delay across networks </li></ul></ul><ul><li>Computer Implementor’s Challenge </li></ul><ul><ul><li>Design with dissimilarly expanding resources </li></ul></ul><ul><ul><li>To double computer performance every two years </li></ul></ul><ul><ul><li>A.k.a., (Popular) Moore’s Law </li></ul></ul>
  18. 18. Outline <ul><li>Computer Primer </li></ul><ul><li>Technology Primer </li></ul><ul><li>Harnessing Moore’s Law </li></ul><ul><ul><li>Microprocessor </li></ul></ul><ul><ul><li>Bit-Level Parallelism </li></ul></ul><ul><ul><li>Instruction-Level Parallelism </li></ul></ul><ul><ul><li>Caching & Memory Hierarchies </li></ul></ul><ul><ul><li>Cost & Implications </li></ul></ul><ul><li>Future Trends </li></ul>
  19. 19. Microprocessor <ul><li>Computers of the 1960s were expensive, using 100s if not 1000s of chips </li></ul><ul><li>First Microprocessor in 1971 </li></ul><ul><ul><li>Processor on one chip </li></ul></ul><ul><ul><li>Intel 4004 </li></ul></ul><ul><ul><li>2300 transistors </li></ul></ul><ul><ul><li>Barely a processor </li></ul></ul><ul><ul><li>Could access 300 bytes of memory (0.0003 megabytes) </li></ul></ul><ul><li>Use more and faster transistors in parallel </li></ul>
  20. 20. Transistor Parallelism <ul><li>To use more transistors quickly, </li></ul><ul><ul><li>use them side-by-side (or in parallel) </li></ul></ul><ul><ul><li>Approach depends on scale </li></ul></ul><ul><li>Consider organizing people </li></ul><ul><ul><li>10 people </li></ul></ul><ul><ul><li>1000 people </li></ul></ul><ul><ul><li>1,000,000 people </li></ul></ul><ul><li>Transistors </li></ul><ul><ul><li>Bit-level parallelism </li></ul></ul><ul><ul><li>Instruction-level parallelism </li></ul></ul><ul><ul><li>(Thread-level parallelism) </li></ul></ul>
  21. 21. Bit-Level Parallelism <ul><li>Less (e.g., 8 * 15 = 120): </li></ul><ul><ul><li>00001000 * 00001111 = </li></ul></ul><ul><ul><li>00001000 </li></ul></ul><ul><ul><li>00001000 </li></ul></ul><ul><ul><li>00001000 </li></ul></ul><ul><ul><li>00001000 </li></ul></ul><ul><ul><li>------------ </li></ul></ul><ul><ul><li>00001111000 </li></ul></ul><ul><li>More: </li></ul><ul><ul><li>010101010101010101010101 * 000011110000111100001111 = </li></ul></ul><ul><ul><li>1010000010100000100111110101111101011111011 </li></ul></ul><ul><li>More bits manipulated faster! </li></ul>
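The partial products on this slide (one shifted copy of the first operand per 1 bit in the second) correspond to shift-and-add multiplication. The C sketch below does the additions one at a time in software purely to make the arithmetic visible; it is not a model of a hardware multiplier, which with enough transistors can combine many bits and partial products at once. The function name `shift_add_multiply` and its `partials` counter are invented for this illustration.

```c
#include <stdint.h>

/* Multiplies a * b by adding one shifted copy of `a` for every 1 bit in `b`.
   Inputs are 32 bits wide, so the product always fits in 64 bits.
   If `partials` is non-null, it receives the number of partial products added. */
uint64_t shift_add_multiply(uint32_t a, uint32_t b, int *partials)
{
    uint64_t product = 0;
    uint64_t shifted = a;    /* `a` shifted left by the current bit position */
    int count = 0;

    while (b != 0) {
        if (b & 1u) {        /* this bit of b contributes a partial product */
            product += shifted;
            count++;
        }
        shifted <<= 1;
        b >>= 1;
    }
    if (partials != 0)
        *partials = count;
    return product;
}
```

With the slide's operands, 8 * 15 adds 4 partial products and gives 120, while the 24-bit example adds 12 partial products and gives 0x50504FAFAFB, the 43-bit result shown on the slide.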
  22. 22. Instruction-Level Parallelism <ul><li>Limits to bit-level parallelism </li></ul><ul><ul><li>Numbers are big enough </li></ul></ul><ul><ul><li>Operations are fast </li></ul></ul><ul><li>Seek parallelism by executing many instructions at once </li></ul><ul><li>Recall Fetch-Execute Loop { </li></ul><ul><ul><li>S1 : read “current” instruction from memory </li></ul></ul><ul><ul><li>S2 : decode instruction to see what is to be done </li></ul></ul><ul><ul><li>S3 : read instruction input(s) </li></ul></ul><ul><ul><li>S4 : perform instruction operation </li></ul></ul><ul><ul><li>S5 : write instruction output(s) </li></ul></ul><ul><ul><li>Also determine “next” instruction and make it “current” </li></ul></ul><ul><li>} </li></ul>
  23. 23. Instruction-Level Parallelism, cont. <ul><li>One-at-a-time instructions per cycle = 1/5 </li></ul><ul><ul><li>Time 01 02 03 04 05 06 07 08 09 10 </li></ul></ul><ul><ul><li>ADD S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>SUB .. .. .. .. .. S1 S2 S3 S4 S5 </li></ul></ul><ul><li>Pipelining instructions per cycle = 1 (or less) </li></ul><ul><ul><li>Time 01 02 03 04 05 06 07 08 09 10 </li></ul></ul><ul><ul><li>ADD S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>SUB .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>ORI .. .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>AND .. .. .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>MUL .. .. .. .. S1 S2 S3 S4 S5 </li></ul></ul>
  24. 24. Instruction-Level Parallelism, cont. <ul><li>4-way Superscalar instructions per cycle = 4 (or less) </li></ul><ul><ul><li>Time 01 02 03 04 05 06 07 08 09 10 </li></ul></ul><ul><ul><li>ADD S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>SUB S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>ORI S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>AND S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>MUL .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>SRL .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>XOR .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>LDW .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>STW .. .. S1 S2 S3 S4 S5 </li></ul></ul><ul><ul><li>DIV .. .. S1 S2 S3 S4 S5 </li></ul></ul>
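The timing diagrams for one-at-a-time, pipelined, and superscalar execution reduce to simple cycle counts in the ideal case. The sketch below assumes every instruction takes the same number of stages and nothing ever stalls, which is the "or less" caveat on the slides taken at its most optimistic; the function names are hypothetical.

```c
/* Ideal cycle counts for n instructions on a machine with `stages` steps per
   instruction. No stalls, dependences, or mispredictions are modeled. */

/* Each instruction finishes all its stages before the next one starts. */
int cycles_one_at_a_time(int n, int stages)
{
    return n * stages;
}

/* A new instruction enters the pipeline every cycle. */
int cycles_pipelined(int n, int stages)
{
    return n > 0 ? stages + n - 1 : 0;
}

/* Up to `width` instructions enter the pipeline together every cycle. */
int cycles_superscalar(int n, int stages, int width)
{
    int groups = (n + width - 1) / width;   /* ceiling of n / width */
    return n > 0 ? stages + groups - 1 : 0;
}
```

These match the diagrams: 2 instructions one at a time take 10 cycles, 5 pipelined instructions take 9 cycles, and 10 instructions on a 4-way superscalar machine take 7 cycles.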
  25. 25. Instruction-Level Parallelism, cont. <ul><li>Current processors have dozens of instructions executing </li></ul><ul><li>Must predict which instructions are next </li></ul><ul><li>Limits to control prediction? </li></ul><ul><li>Look elsewhere? (thread-level parallelism later) </li></ul><ul><li>Memory a serious problem </li></ul><ul><ul><li>1980: memory access time = one instruction time </li></ul></ul><ul><ul><li>2000: memory access time = 100 instruction times </li></ul></ul>
  26. 26. Caching & Memory Hierarchies <ul><li>Memory can be </li></ul><ul><ul><li>Fast </li></ul></ul><ul><ul><li>Vast </li></ul></ul><ul><ul><li>But not both </li></ul></ul><ul><li>Use two memories </li></ul><ul><ul><li>Cache: small, fast (e.g., 64,000 bytes in 1 ns) </li></ul></ul><ul><ul><li>Memory: large, slow (e.g., 64,000,000 bytes in 100 ns) </li></ul></ul><ul><li>Use prediction to fill cache </li></ul><ul><ul><li>Likely to re-reference information </li></ul></ul><ul><ul><li>Likely to reference nearby information </li></ul></ul><ul><ul><li>E.g., address book cache of phone directory </li></ul></ul>
  27. 27. Caching & Memory Hierarchies, cont. <ul><li>Cache + Memory makes memory look fast & vast </li></ul><ul><ul><li>If cache has information on 99% of accesses </li></ul></ul><ul><ul><li>1 ns + 1% * 100 ns = 2 ns </li></ul></ul><ul><ul><li>E.g. P3 (w/o L2 cache) </li></ul></ul><ul><li>Caching Applied Recursively </li></ul><ul><ul><li>Registers </li></ul></ul><ul><ul><li>Level-one cache </li></ul></ul><ul><ul><li>Level-two cache </li></ul></ul><ul><ul><li>Memory </li></ul></ul><ul><ul><li>Disk </li></ul></ul><ul><ul><li>(File Server) </li></ul></ul><ul><ul><li>(Proxy Cache) </li></ul></ul>
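The "1 ns + 1% * 100 ns = 2 ns" calculation extends naturally to a hierarchy with more levels, since each level acts as a cache for the one below it. The following C sketch is one simple way to write that down; the function name `average_access_ns` and the array layout are assumptions made for this example, and the three-level numbers in the usage note are made up rather than measurements of any real machine.

```c
/* Average access time of a memory hierarchy, fastest level first.
   latency_ns[i] is the time to access level i; miss_rate[i] is the fraction
   of accesses to level i that must continue on to level i + 1. The miss rate
   of the last level is ignored, since that level is assumed to hold everything. */
double average_access_ns(const double latency_ns[], const double miss_rate[],
                         int levels)
{
    double time = 0.0;
    int i;

    /* Work upward from the slowest level toward the fastest. */
    for (i = levels - 1; i >= 0; i--) {
        double below = (i == levels - 1) ? 0.0 : miss_rate[i] * time;
        time = latency_ns[i] + below;
    }
    return time;
}
```

With latencies {1, 100} ns and a 1% cache miss rate this gives the slide's 2 ns. With hypothetical latencies {1, 10, 100} ns and miss rates of 5% and 20% for the first two levels, it gives 1 + 0.05 * (10 + 0.2 * 100) = 2.5 ns.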
  28. 28. Cost Side of Moore’s Law <ul><li>About every two years: same computing at half cost </li></ul><ul><li>Long-term effect: </li></ul><ul><ul><li>1940s Prototypes for calculating ballistic trajectories </li></ul></ul><ul><ul><li>1950s Early mainframes for large banks </li></ul></ul><ul><ul><li>1960s Mainframes flourish in many large businesses </li></ul></ul><ul><ul><li>1970s Minicomputers for business, science, & engineering </li></ul></ul><ul><ul><li>Early 1980s PCs for word processing & spreadsheets </li></ul></ul><ul><ul><li>Late 1980s PCs for desktop publishing </li></ul></ul><ul><ul><li>1990s PCs for games, multimedia, e-mail, & web </li></ul></ul><ul><li>Jim Gray: In ten years you can buy a computer for the cost of its sales tax today (assuming 3% or more) </li></ul>
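The sales-tax remark follows directly from halving the cost every two years. A small sketch, assuming cost drops in discrete halving steps and using a hypothetical helper named `cost_after`:

```c
/* Cost of the same computing after `years`, if cost halves once every
   `halving_years` years. Only completed halving periods are counted. */
double cost_after(double cost_today, int years, int halving_years)
{
    int halvings = years / halving_years;
    double cost = cost_today;
    int i;

    for (i = 0; i < halvings; i++)
        cost /= 2.0;
    return cost;
}
```

Ten years at one halving every two years is five halvings, a factor of 32, so `cost_after(100.0, 10, 2)` is 3.125: about 3% of today's price, which is where the "3% or more" sales tax comes from.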
  29. 29. Outline <ul><li>Computer Primer </li></ul><ul><li>Technology Primer </li></ul><ul><li>Harnessing Moore’s Law </li></ul><ul><li>Future Trends </li></ul><ul><ul><li>Moore’s Law </li></ul></ul><ul><ul><li>Harnessing Moore’s Law </li></ul></ul><ul><ul><li>Computer uses </li></ul></ul><ul><ul><li>Some Non-Technical Implications </li></ul></ul>
  30. 30. Revolutions <ul><li>Industrial Revolution enabled by machines </li></ul><ul><ul><li>Interchangeable parts </li></ul></ul><ul><ul><li>Mass production </li></ul></ul><ul><ul><li>Lower costs → expanded application </li></ul></ul><ul><li>Information Revolution enabled by machines </li></ul><ul><ul><li>Interchangeable purpose (software) </li></ul></ul><ul><ul><li>Mass production (chips = integrated circuits) </li></ul></ul><ul><ul><li>Lower costs → expanded application </li></ul></ul>
  31. 31. Future of Moore’s Law <ul><li>Short-Term (1-5 years) </li></ul><ul><ul><li>Will operate (due to prototypes in lab) </li></ul></ul><ul><ul><li>Fabrication cost will go up rapidly </li></ul></ul><ul><li>Medium-Term (5-15 years) </li></ul><ul><ul><li>Exponential growth rate will likely slow </li></ul></ul><ul><ul><li>Trillion-dollar industry is motivated </li></ul></ul><ul><li>Long-Term (>15 years) </li></ul><ul><ul><li>May need new technology (chemical or quantum) </li></ul></ul><ul><ul><li>We can do better (e.g., human brain) </li></ul></ul><ul><ul><li>I would not close the patent office </li></ul></ul>
  32. 32. Future of Harnessing Moore’s Law <ul><li>Thread-Level Parallelism </li></ul><ul><ul><li>Multiple processors cooperating (exists today) </li></ul></ul><ul><ul><li>More common in future with multiple processors per chip </li></ul></ul><ul><ul><li>Parallelism in Internet? The Grid. </li></ul></ul><ul><li>System on a Chip </li></ul><ul><ul><li>Processor, memory, and I/O on one chip </li></ul></ul><ul><ul><li>Cost-performance leap like microprocessor? </li></ul></ul><ul><ul><li>(e.g., accelerometer at right) </li></ul></ul><ul><li>Communication </li></ul><ul><ul><li>World-wide web & wireless cell phone fuse! </li></ul></ul><ul><li>Other properties: robust & easy to design & use </li></ul>
  33. 33. Future Computer Uses <ul><li>Computer cost-effectiveness determines application viability </li></ul><ul><ul><li>Spreadsheets on a US$2M mainframe do not make sense </li></ul></ul><ul><ul><li>A 10x cost-performance change enables new possibilities [Joy] </li></ul></ul><ul><li>Most computers will NOT be computers </li></ul><ul><ul><li>How many electric motors do you have in your home? </li></ul></ul><ul><ul><li>How many did you buy as electric motors? </li></ul></ul><ul><ul><li>I control several computers, but most computers I control are embedded in cars, remote controls, refrigerators, etc. </li></ul></ul><ul><li>Two Stories </li></ul><ul><ul><li>Danny Hillis’s doorknobs </li></ul></ul><ul><ul><li>William Wulf’s “powerful” computer </li></ul></ul>
  34. 34. Future Computer Uses, cont. <ul><li>Technologists have always been poor predictors of future use </li></ul><ul><ul><li>Edison invented the motion picture machine </li></ul></ul><ul><ul><li>Hollywood invented movies </li></ul></ul><ul><li>To Predict: </li></ul><ul><ul><li>What would you want if it was 10 times cheaper? </li></ul></ul><ul><ul><li>What can be 10 times cheaper if you make more? </li></ul></ul><ul><ul><li>Better yet, ask a ten year old! </li></ul></ul><ul><li>What do you think? </li></ul>
  35. 35. Some Non-Technical Thoughts <ul><li>We make over a billion transistors/second </li></ul><ul><ul><li>One transistor per man/woman/child in < 10 seconds (humankind has made many more transistors than bricks!) </li></ul></ul><ul><ul><li>But those transistors are not being distributed equally </li></ul></ul><ul><li>Computers can be incredibly effective tools </li></ul><ul><ul><li>Knowledge workers in medicine, law, & engineering </li></ul></ul><ul><ul><li>But not unskilled laborers! </li></ul></ul><ul><li>Computer use will exacerbate the social gradient </li></ul><ul><li>As citizens, we should ask </li></ul><ul><ul><li>Can/should we ameliorate this trend? </li></ul></ul><ul><ul><li>If so, how? </li></ul></ul>
  36. 36. Summary <ul><li>Computers are machines for purposes “to be determined” </li></ul><ul><li>Vast cost reductions have enabled new uses </li></ul><ul><ul><li>Software flexibility </li></ul></ul><ul><ul><li>Moore’s Law and its harnessing </li></ul></ul><ul><li>Technology should be our tool, not our master </li></ul><ul><ul><li>Many benefits </li></ul></ul><ul><ul><li>Some costs </li></ul></ul>
