Hardware/Software Co-design:
The Coming Golden Age
Bryan Cantrill
Oxide Computer Company
The hardware/software divide
● The shift to public cloud computing over the last fifteen years has
allowed software and hardware to become disconnected
● On the one hand, this can be empowering: a SaaS offering can be built
with no real understanding of the hardware beneath it
● But there’s a risk of taking software-centric thinking too far -- and
drawing the mistaken conclusion that hardware is irrelevant (or worse)
● This overshoot in thinking is epitomized by Marc Andreessen’s 2011
essay, “Why Software is Eating the World”
Revisiting Andreessen
● Certainly, the essay makes an important observation on the importance
of software in essentially every domain:
Revisiting Andreessen
● And the effect of Moore’s Law + open source + public cloud computing
has indisputably lowered the cost of delivering software:
Revisiting Andreessen
● But the essay errs in fetishizing software, mistakenly viewing extant
industries as likely to be disrupted by SaaS alone:
Revisiting Andreessen
● Software is important -- but the essay conflates software companies
with companies that in fact integrate software and hardware
● Companies that Andreessen cited that have thrived -- Amazon, Google,
etc. -- have very significant hardware components!
● Many software-only companies that are cited have disappointed:
Zynga, Rovio, Groupon, LivingSocial, Foursquare
● Andreessen is dismissive of Apple (up 15X) -- and entirely ignores
companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!
Revisiting another famous essay
Gordon Moore, ca. 1965
Moore’s Law?
Moore’s Law!
Moore’s Law?!
So… Moore’s Law?
● In his 1965 paper, there is no Moore’s Law per se — just a bunch of
incredibly astute and prescient observations
● The term “Moore’s Law” would be coined by Carver Mead in 1971 as
part of his work on determining ultimate physical limits
● Moore updated the law in 1975 to be a doubling of transistor density
every two years (Dennard scaling would be outlined in detail in 1974;
a compact form of the doubling is sketched below)
● For many years, Moore’s Law could be inferred to be doublings of
transistor density, speed, and economics
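A compact statement of the 1975 formulation above, with $N_0$ the transistor density at a reference year and $t$ the years elapsed:

$$N(t) = N_0 \cdot 2^{t/2}$$

A decade of on-schedule scaling thus implies $2^{10/2} = 32\times$ the transistors per unit area.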
Moore’s Law: Good old days?
● The 1980s and early 1990s were great for Moore’s Law — so much so
that computers needed a “turbo button” to counteract its effects (!!)
● But even in those halcyon years, Moore’s Law was leaving DRAM
behind: memory was becoming denser but no faster
● An increasing number of workloads began hitting the memory wall
● Caching was necessary but insufficient...
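To make the wall concrete, a back-of-envelope sketch (the latency and clock rates are illustrative assumptions, not figures from the talk): DRAM latency stayed roughly flat while CPU clocks soared, so the same cache miss grew ever more expensive in cycle terms:

$$\text{miss penalty (cycles)} = t_{\mathrm{DRAM}} \times f_{\mathrm{CPU}}: \quad 60\,\mathrm{ns} \times 100\,\mathrm{MHz} = 6 \text{ cycles}; \quad 60\,\mathrm{ns} \times 3\,\mathrm{GHz} = 180 \text{ cycles}$$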
Moore’s Law: The memory wall
● By the mid-1990s, it had become clear that symmetric multiprocessing
was the path to deliver throughput on multi-threaded workloads
● ...but SMP did nothing for single-threaded performance
● Deep pipelining and VLIW were — largely — failed experiments
● For single-threaded workloads, microprocessors turned to out-of-order
and speculative execution to hide memory latency
● Even in simpler times, scaling with Moore’s Law was a challenge!
Moore’s Law: Architectural shifts
● Dennard scaling ended in ~2006 due to current leakage…
● ...but by then chip multiprocessing was clearly the trajectory
● CMP was enhanced by simultaneous multithreading (SMT), which
offered up to another factor of two on throughput
● Thanks to the earlier software work on SMP, CMP/SMT was less of a
software performance apocalypse than some feared — but more of a
security apocalypse than anyone anticipated!
● And “dark silicon” greatly limits CMP!
Moore’s Law: Deceleration
● In August 2018, GlobalFoundries suddenly stopped 7nm development,
citing economics -- it was simply too expensive to stay competitive
● GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel
on 14nm, struggling to get to 10nm
● Intel’s Cannon Lake was three years late and an unmitigated disaster --
and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
● Moving to 5nm/3nm requires EUV photolithography -- and, beyond that,
moving from FinFETs to GAAFETs; new nodes are very expensive!
Aside: Process nodes
● You may well wonder: when a process node is “7nm” or “5nm”, what
exactly is seven nanometers or five nanometers long? (And, um, how big
is a silicon atom anyway?)
● Answer to the second question: ~210 picometers!
● Answer to the first question: nothing! Unbelievably, the name of the
process node no longer measures anything at all (!!) -- it is merely a
rough expression of transistor density (and implication of process)
● E.g., 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
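Taking that ≈100 MTr/mm² figure at face value, a quick illustrative calculation (the 100 mm² die size is an assumption for the example, not a figure from the talk):

$$100\ \mathrm{mm^2} \times 100\ \mathrm{MTr/mm^2} = 10^4\ \mathrm{MTr} = 10\ \text{billion transistors}$$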
Moore’s Law
● Increases in transistor density continue to be possible, but at a greatly
slowed pace -- and at outsized cost
● Economically, Moore’s Law is indisputably ending
● But is there another way of looking at it?
Another essay, further back in time...
Theodore Wright, ca. 1936
Wright’s Law
● In 1936, Theodore Wright studied the costs of aircraft manufacturing,
finding that the cost dropped with experience
● Over time, each doubling of cumulative volume dropped unit costs by
10-15% (a compact form of this learning curve appears below)
● This phenomenon has been observed in other technological domains
● In 2013, Jessika Trancik et al. found Wright’s Law to have better
predictive power for transistor cost than Moore’s Law!
● Wright’s Law seems to hold, especially for older process nodes
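A compact form of the learning curve sketched above, with unit cost $C$ as a function of cumulative volume $x$:

$$C(x) = C_1\, x^{-b}, \qquad \frac{C(2x)}{C(x)} = 2^{-b}$$

A 15% drop per doubling corresponds to $2^{-b} = 0.85$, i.e. $b = -\log_2 0.85 \approx 0.23$.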
Wright on market creation
Wright foreshadowing Moore
One final essay...
W. Stanley Jevons, ca. 1865
Jevons foreshadowing Wright
Aside: Never say “never”
Aside: A contemporary weighs in on Jevons?
Back to computing!
● Andreessen’s 2011 piece, while containing some truisms, is overly
software-centric and misses hardware’s role entirely
● Moore’s Law -- while prescient! -- is indisputably slowing
● Wright’s Law, however, may still be holding for transistors -- especially
at older process nodes (22nm, 40nm, 90nm, etc.)
● The Jevons Paradox has proven again and again to apply to computing:
when general purpose computation is cheaper, we find more to do
● We can expect more computation in more places
Compute everywhere?
● More computation doesn’t just mean computers in new places (à la IoT),
it means CPUs in places we once thought of as mere components
● E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
● We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash
(e.g., open-channel SSD) and on the spindle (e.g. WD’s SweRV)
● New opportunities for hardware/software co-design: keep hardware
simple and put more sophistication into software and/or soft logic
● There are several trends acting as accelerants for this shift...
Open instruction sets
● x86 and ARM -- the two market victors -- are both encumbered by
history and licensing
● RISC-V is an attempt to learn from the ISA mistakes of the past, in a
vessel that is entirely open -- and with open implementations
● RISC-V is very promising, but there remain many gaps to close
● To succeed, RISC-V must focus as much on the SoC as the ISA -- while
remaining entirely open!
Open FPGAs
● FPGA bitstreams have historically been entirely proprietary -- and one
is therefore dependent upon proprietary tools to generate them
● The Lattice iCE40 bitstream format was reverse engineered in 2015 by
Claire Wolf, and can be entirely synthesized with an open toolchain!
● While Xilinx (AMD) and Altera (Intel) retain proprietary components
(e.g., for timing models), newcomers like QuickLogic are entirely open
● See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and
the (new!) Open Source FPGA Foundation
Open HDLs
● Hardware description languages have traditionally been dominated by
Verilog and (later) SystemVerilog
● Compilers have been historically proprietary -- and the languages
themselves are error prone
● In recent years we have seen a wave of new, open HDLs, e.g.: Chisel,
nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
● Of these, one is particularly noteworthy...
Open HDL: Bluespec
● Bluespec is a high-level HDL that takes its inspiration from formal
specification languages -- and strongly typed languages like Haskell
● Bluespec uses the expressiveness of the language to move away from
individual signals -- and to atomic rules and interfaces
● This allows for the compiler to do the hard work of connecting modules
and proving correctness, greatly reducing verification time!
● In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to
SystemVerilog what Rust is to assembly”
Open HDL: Bluespec
● Bluespec was proprietary for 20 years; open sourced in early 2020!
● We at Oxide feel that Bluespec is a profoundly transformative
technology -- but not one that is broadly understood or appreciated!
● More details:
○ https://github.com/B-Lang-org/Documentation
○ https://github.com/B-Lang-org/bsc
○ https://github.com/oxidecomputer/cobalt
Open source EDA
● Proprietary software has historically dominated EDA…
● Open source alternatives have existed for years -- but one in particular,
KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with
professional-grade software
● The maturity of KiCad coupled with the rise of quick turn PCB
manufacturing/assembly has allowed for astonishing speed:
○ From conception to manufacturer in hours
○ From manufacturer to shipping board in days
Board economics
● Single board computers are very accessible!
○ An STM32 Nucleo-144 board with 400 MHz Cortex-M7 CPU + 2
MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
○ A BeagleBone Black -- with 1 GHz Cortex-A8 CPU + 4 GB of flash +
512 MB DDR3 + HDMI for less than $60!
● All documentation available online and without NDA -- and the
BeagleBone Black is (nearly) entirely open
● The BeagleBone Black can also be used as a logic analyzer via sigrok
Open source firmware
● The software that runs closest to the hardware is increasingly open,
with drivers nearly (nearly!) always open
● Increasingly, we are seeing the firmware of unseen parts of the system
become open as well (see, e.g., the Open Source Firmware Conference)
● This trend is slower in the 7nm SoCs -- but it’s happening!
● However, even in putatively open architectures, there generally still
remains proprietary software in the form of boot ROMs -- and this
proprietary software remains a problem!
Embedded Rust
● Rust has proven to be a revolution for systems software: its rich type
system, algebraic data types, and ownership model allow for fast, correct code
● Slightly more surprising has been Rust’s ability to get small -- which,
coupled with its lack of a runtime, lets it fit everywhere (see the
sketch below)!
● With its safety and expressive power, Rust represents a quantum leap
over C -- and without losing performance or sacrificing size
● We at Oxide are working on a de novo Rust operating system for the
embedded use case that we will (naturally?) open source; stay tuned!
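As a minimal sketch of what “no runtime, fits everywhere” means in practice -- a freestanding no_std binary (the spinning panic handler and bare _start entry point are illustrative; real firmware would use a board’s HAL crate and startup code such as cortex-m-rt):

```rust
#![no_std]  // no standard library: only `core`, which assumes no OS
#![no_main] // no conventional entry point; we provide our own symbol

use core::panic::PanicInfo;

// Every no_std binary must define its panic behavior; here we simply spin.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// Bare entry point; on a real board this would be the reset handler
// (e.g., the #[entry] attribute from the cortex-m-rt crate).
#[no_mangle]
pub extern "C" fn _start() -> ! {
    let mut counter: u32 = 0;
    loop {
        // Real firmware would toggle a GPIO or service peripherals here.
        counter = counter.wrapping_add(1);
    }
}
```

Built with, e.g., cargo build --target thumbv7em-none-eabihf (the bare-metal target matching the Cortex-M7 Nucleo-144 above), there is no runtime to carry along -- the binary is just the code you wrote.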
To sum...
“That changed everything”
A new Golden Age!
● Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is
easier to build hardware than ever before!
● We are going to see computers in many more places, posing challenges
to us all to develop reliable, secure, high-performing systems
● Software remains essential, but we must not think of it in isolation; we
must co-design the hardware and the software in our systems!
● The systems are open, the communities are welcoming! Let’s build!
