Isca needle a_0610

Moving the Needle: Computer Architecture Research in Academe and Industry
By Bill Dally

Published in: Technology

  1. 1. Moving the Needle: Computer Architecture Research in Academe and Industry<br />Bill Dally<br />Chief Scientist & Sr. VP of Research, NVIDIA<br />Bell Professor of Engineering, Stanford University<br />
  2. 2. Outline<br />The Research Funnel<br />Most ideas fail<br />Those that succeed take 5-10 years <br />The Research Formula<br />Constraints<br />The Academic Advantage<br />The Industrial Advantage<br />Startups<br />Best practices<br />
  3. 3. Goal – Positive Impact on a Product<br />
  4. 4. The Research Funnel<br />Technology<br />insight<br />Concept<br />Dev<br />Model<br />Eval<br />Dev<br />Applications<br />
  5. 5. Most ideas fail<br />The ideas that succeed take a long time<br />Concept<br />Dev<br />Model<br />Eval<br />Dev<br />
  7. 7. Most ideas fail<br />So terminate the bad ones quickly<br />
  8. 8. Most ideas fail<br />So terminate the bad ones quickly<br />Be a terminator, not an advocate<br />
  9. 9. Dally, “Micro-Optimization of Floating-Point Operations,” ASPLOS, 1989, pp. 283-289<br />
  10. 10.
  11. 11. Most ideas fail<br />The ideas that succeed take a long time<br />Concept<br />Dev<br />Model<br />Eval<br />Dev<br />
  12. 12. The ideas that succeed take a long time<br />So aim research 5-10 years ahead of current practice<br />
  13. 13. Current Architecture Practice<br />
  14. 14.
  15. 15. Aim Here<br />5-10 years<br />
  16. 16. Enable this point<br />5-10 years<br />
  17. 17. Timeline for some ideas<br />
  18. 18. The Performance Equation<br />
  19. 19. The Research Formula<br />
  20. 20. Reward<br />If you are wildly successful, what difference will it make?<br />
  21. 21. Effort<br />Learn as much as possible with as little work as possible<br />
  22. 22. Effort<br />Do the minimum analysis and experimentation necessary to make a point<br />
  23. 23. Real and Artificial Constraints<br />
  24. 24. Constraining Infrastructure<br />Benchmarks<br />Binaries<br />Compiler<br />Simulator<br />uArch Idea<br />ISA<br />Other<br />uArch<br />
  27. 27. The contribution is insight<br />Not novelty<br />Not numbers<br />
  28. 28. Research is a hunt for insight<br />Need to get off the beaten path to find new insights<br />
  29. 29. Road-Kill Research<br />Benchmarks<br />Binaries<br />Compiler<br />Simulator<br />uArch Idea<br />ISA<br />Other<br />uArch<br />
  30. 30.
  31. 31. Looking here for lost keys<br />
  32. 32. Lost keys here<br />Looking here<br />
  33. 33. The Academic Advantage<br />
  34. 34. The Academic Advantage<br /> Freedom<br />
  35. 35. The Academic Advantage<br /> Freedom from artificial constraints<br /> Freedom to fail (take risks)<br />
  36. 36. Academic research matched for early stages of the funnel<br />Concept<br />Dev<br />Model<br />Eval<br />Dev<br />
  37. 37. Example: ELM<br />An Ensemble<br />Many Ensembles and memory tiles on a die<br />
  38. 38. Example: ELM<br />Balfour et al., "An Energy-Efficient Processor Architecture for Embedded Systems," CAL, Jan. 2008, pp. 29-32<br />
  39. 39. ELM Infrastructure<br />Benchmarks<br />Binaries<br />Compiler<br />Simulator<br />uArch Idea<br />ISA<br />Other<br />uArch<br />Changed for ELM<br />
  40. 40. The Industrial Advantage<br /> Resources and Experience<br />
  41. 41. The Industrial Advantage<br />Resources to carry out detailed studies<br />Experience to address commercial constraints<br />
  42. 42. The ideal partnership:<br />Academic research 5-10 years out, focused on industry problems<br />Transfer insight to industrial research to refine into product<br />Concept<br />Dev<br />Model<br />Eval<br />Dev<br />
  43. 43. What transfers is insight<br />Not academic design<br />Not performance numbers<br />
  44. 44. What transfers is insight<br />And it’s transferred by people<br />Not papers<br />
  45. 45. Academic<br />Concept<br />Analysis<br />Simulation<br />Prototype<br />Refine Concept<br />Detailed Design<br />Industrial<br />
  46. 46. Industrial<br />Academic<br />Gap<br />Concept<br />Analysis<br />Simulation<br />Prototype<br />Refine Concept<br />Detailed Design<br />Impact<br />Paper<br />
  47. 47. Example: Cray T3D and T3E<br />
  48. 48. J-Machine<br />MIT 1987-1992<br />3-D network<br />Global address space<br />Fast messaging and synchronization<br />Support for many models of computation<br />
  52. 52. Cray T3D<br />Started working with Cray in 1989<br />Project started early 1990<br />First ship in mid 1992<br />From J-Machine: network, fast communication/sync, global address space<br />For reality: Alpha processors, MECL gate arrays, robust software stack<br />
  62. 62. Best Practices for Academics<br />Long-term perspective (5-10 years)<br />Know your customer and their long-term issues<br />Look at tomorrow’s applications, not yesterday’s<br />Maximize reward, minimize effort<br />Estimate maximum impact – terminate…<br />Minimal analysis and experiment to make the point<br />Exploit your freedom<br />Don’t be limited by existing tools, benchmarks, ISAs, …<br />Carry result to impact<br />Build relationships with industry<br />Benchmarks<br />Binaries<br />Compiler<br />Simulator<br />uArch Idea<br />ISA<br />Other<br />uArch<br />
  72. 72. Best Practices for Industry<br />Leverage academic research<br />Build partnerships<br />Articulate long-term research issues<br />Be open-minded<br />Minimize artificial constraints<br />Carry concepts across “the gap”<br />Open infrastructure<br />
  78. 78. A Partnership<br />Filtered, De-risked Concepts<br />Academe<br />Industry<br />Future issues<br />Infrastructure<br />
  79. 79. The Startup Path<br />When you can’t find an appropriate industrial partner, make one.<br />STAC, Avici, Velio, SPI<br />
  80. 80. Academic<br />Concept<br />Analysis<br />Simulation<br />Prototype<br />Refine Concept<br />Detailed Design<br />Startup<br />
  81. 81. Startup Pros/Cons<br />Pros:<br />Don’t have to convince existing company to change course (until exit)<br />Cons:<br />Have to convince investors (repeatedly)<br />Have to build a whole company, not just a development team<br />Finance, sales, marketing, …<br />Limited resources<br />Impatient capital<br />
  85. 85. Example: SPI<br />
  86. 86. Much easier to license technology to an existing company<br />
  87. 87. Starting a company to bring a new semiconductor product to market costs $30M (to cash flow positive)<br />If it’s a programmable processor, it’s $70M<br />Investors want a 10x ROI<br />Need to see a $700M exit to justify a new processor company<br />
  88. 88. The future of computer architecture<br />
  89. 89. The future of computer architecture<br />NOW is an ideal time for research to move the needle<br />Computers are drastically changing<br />Pervasive parallelism<br />Energy limited<br />Bandwidth constrained<br />Opportunity to set the MSB of future computers in the next few years<br />Requires changing the whole stack<br />Requires industry-academe partnership<br />
  96. 96. Energy-Efficient Architecture: Abstracting Locality<br />Figure labels: 20mm die; P, L1, L2, L3, Net; 7pJ, 50pJ, 500pJ, 2000pJ<br />
  97. 97. Solution involves many levels of the “stack”<br />Application<br />Algorithm<br />Prog. System<br />Compiler<br />ISA<br />uArch<br />Too constrained to innovate within one layer<br />Design<br />Circuits<br />Process<br />
  98. 98. Benchmarks<br />Binaries<br />Compiler<br />Academe<br />Industry<br />Simulator<br />uArch Idea<br />ISA<br />Other<br />uArch<br />
  99. 99. Moving the Needle: Computer Architecture Research in Academe and Industry<br />Bill Dally<br />Chief Scientist & Sr. VP of Research, NVIDIA<br />Bell Professor of Engineering, Stanford University<br />
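
The startup arithmetic on slide 87 (cost to reach cash-flow positive, times the return investors want) can be checked with a short sketch. The function name `required_exit_musd` and the constants are hypothetical, introduced only for illustration. The model assumes investors fund the whole cost and receive the whole exit, which the slides do not state. The $300M figure for a non-processor product is derived here, not quoted from the deck.

```python
# Figures from slide 87, in millions of US dollars.
COST_SEMICONDUCTOR_M = 30  # new semiconductor product, to cash-flow positive
COST_PROCESSOR_M = 70      # same, when the product is a programmable processor
TARGET_ROI = 10            # investors want a 10x return


def required_exit_musd(cost_musd: float, roi_multiple: float = TARGET_ROI) -> float:
    """Return the exit value (in $M) that gives roi_multiple times the cost.

    Simplifying assumption (not from the slides): investors fund the
    entire cost and receive the entire exit value.
    """
    return cost_musd * roi_multiple


if __name__ == "__main__":
    for label, cost in [
        ("semiconductor product", COST_SEMICONDUCTOR_M),
        ("programmable processor", COST_PROCESSOR_M),
    ]:
        exit_value = required_exit_musd(cost)
        print(f"{label}: ${cost}M invested, ${exit_value:.0f}M exit needed")
```

Under these assumptions the processor case reproduces the slide's $700M threshold, which is why slide 86 argues that licensing to an existing company is usually the easier path.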
