Hardware/Software Co-design:
The Coming Golden Age
Bryan Cantrill
Oxide Computer Company
The hardware/software divide
● The shift to public cloud computing over the last fifteen years has
allowed software and hardware to become disconnected
● On the one hand, this can be empowering: a SaaS offering can be built
with no real understanding of the hardware beneath it
● But there’s a risk of taking software-centric thinking too far -- and
drawing the mistaken conclusion that hardware is irrelevant (or worse)
● This overshoot in thinking is epitomized by Marc Andreessen’s 2011
essay, “Why Software is Eating the World”
Revisiting Andreessen
● Certainly, the essay makes an important observation on the importance
of software in essentially every domain:
Revisiting Andreessen
● And the effect of Moore’s Law + open source + public cloud computing
has indisputably lowered the cost of delivering software:
Revisiting Andreessen
● But the essay errs in fetishizing software, mistakenly viewing extant
industries as likely to be disrupted by SaaS alone:
Revisiting Andreessen
● Software is important -- but the essay conflates software companies
with companies that in fact integrate software and hardware
● Companies that Andreessen cited that have thrived -- Amazon, Google,
etc. -- have very significant hardware components!
● Many software-only companies that are cited have disappointed:
Zynga, Rovio, Groupon, LivingSocial, Foursquare
● Andreessen is dismissive of Apple (up 15X) -- and entirely ignores
companies like NVIDIA (57X), AMD (14X), or even Intel (3X)!
Revisiting another famous essay
Gordon Moore, ca. 1965
Moore’s Law?
Moore’s Law!
Moore’s Law?!
So… Moore’s Law?
● In his 1965 paper, there is no Moore’s Law per se — just a bunch of
incredibly astute and prescient observations
● The term “Moore’s Law” would be coined by Carver Mead in 1971 as
part of his work on determining ultimate physical limits
● Moore updated the law in 1975 to be a doubling of transistor density
every two years (Dennard scaling would be outlined in detail in 1974;
a compact form of the doubling is sketched below)
● For many years, Moore’s Law could be inferred to be doublings of
transistor density, speed, and economics
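A compact statement of the 1975 formulation above, with $N_0$ the transistor density at a reference year and $t$ the years elapsed:

$$N(t) = N_0 \cdot 2^{t/2}$$

A decade of on-schedule scaling thus implies $2^{10/2} = 32\times$ the transistors per unit area.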
Moore’s Law: Good old days?
● The 1980s and early 1990s were great for Moore’s Law — so much so
that computers needed a “turbo button” to counteract its effects (!!)
● But even in those halcyon years, Moore’s Law was leaving DRAM
behind: memory was becoming denser but no faster
● An increasing number of workloads began hitting the memory wall
● Caching was necessary but insufficient...
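To make the wall concrete, a back-of-envelope sketch (the latency and clock rates are illustrative assumptions, not figures from the talk): DRAM latency stayed roughly flat while CPU clocks soared, so the same cache miss grew ever more expensive in cycle terms:

$$\text{miss penalty (cycles)} = t_{\mathrm{DRAM}} \times f_{\mathrm{CPU}}: \quad 60\,\mathrm{ns} \times 100\,\mathrm{MHz} = 6 \text{ cycles}; \quad 60\,\mathrm{ns} \times 3\,\mathrm{GHz} = 180 \text{ cycles}$$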
Moore’s Law: The memory wall
● By the mid-1990s, it had become clear that symmetric multiprocessing
was the path to deliver throughput on multi-threaded workloads
● ...but SMP did nothing for single-threaded performance
● Deep pipelining and VLIW were — largely — failed experiments
● For single-threaded workloads, microprocessors turned to out-of-order
and speculative execution to hide memory latency
● Even in simpler times, scaling with Moore’s Law was a challenge!
Moore’s Law: Architectural shifts
● Dennard scaling ended in ~2006 due to current leakage…
● ...but by then chip multiprocessing was clearly the trajectory
● CMP was enhanced by simultaneous multithreading (SMT), which
offered up to another factor of two on throughput
● Thanks to the earlier software work on SMP, CMP/SMT was less of a
software performance apocalypse than some feared — but more of a
security apocalypse than anyone anticipated!
● And “dark silicon” greatly limits CMP!
Moore’s Law: Deceleration
● In August 2018, GlobalFoundries suddenly stopped 7nm development,
citing economics -- it was simply too expensive to stay competitive
● GlobalFoundries’ departure left TSMC and Samsung on 7nm -- and Intel
on 14nm, struggling to get to 10nm
● Intel’s Cannon Lake was three years late and an unmitigated disaster --
and for Ice Lake/Cascade Lake, Intel is intermixing 14nm and 10nm
● Moving to 5nm/3nm requires EUV photolithography -- and, beyond that,
moving from FinFETs to GAAFETs; new nodes are very expensive!
Aside: Process nodes
● You may well wonder: when a process node is “7nm” or “5nm”, what
exactly is seven nanometers or five nanometers long? (And, um, how big
is a silicon atom anyway?)
● Answer to the second question: ~210 picometers!
● Answer to the first question: nothing! Unbelievably, the name of the
process node no longer measures anything at all (!!) -- it is merely a
rough expression of transistor density (and implication of process)
● E.g., 7nm ≈ 100 MTr/mm² (but there are lots of caveats)
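Taking that ≈100 MTr/mm² figure at face value, a quick illustrative calculation (the 100 mm² die size is an assumption for the example, not a figure from the talk):

$$100\ \mathrm{mm^2} \times 100\ \mathrm{MTr/mm^2} = 10^4\ \mathrm{MTr} = 10\ \text{billion transistors}$$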
Moore’s Law
● Increases in transistor density continue to be possible, but at a greatly
slowed pace -- and at outsized cost
● Economically, Moore’s Law is indisputably ending
● But is there another way of looking at it?
Another essay, further back in time...
Theodore Wright, ca. 1936
Wright’s Law
● In 1936, Theodore Wright studied the costs of aircraft manufacturing,
finding that the cost dropped with experience
● Over time, each doubling of cumulative volume dropped unit costs by
10-15% (a compact form of this learning curve appears below)
● This phenomenon has been observed in other technological domains
● In 2013, Jessika Trancik et al. found Wright’s Law to have better
predictive power for transistor cost than Moore’s Law!
● Wright’s Law seems to hold, especially for older process nodes
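A compact form of the learning curve sketched above, with unit cost $C$ as a function of cumulative volume $x$:

$$C(x) = C_1\, x^{-b}, \qquad \frac{C(2x)}{C(x)} = 2^{-b}$$

A 15% drop per doubling corresponds to $2^{-b} = 0.85$, i.e. $b = -\log_2 0.85 \approx 0.23$.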
Wright on market creation
Wright foreshadowing Moore
One final essay...
W. Stanley Jevons, ca. 1865
Jevons foreshadowing Wright
Aside: Never say “never”
Aside: A contemporary weighs in on Jevons?
Back to computing!
● Andreessen’s 2011 piece, while containing some truisms, is overly
software-centric and misses hardware’s role entirely
● Moore’s Law -- while prescient! -- is indisputably slowing
● Wright’s Law, however, may still be holding for transistors -- especially
at older process nodes (22nm, 40nm, 90nm, etc.)
● The Jevons Paradox has proven again and again to apply to computing:
when general purpose computation is cheaper, we find more to do
● We can expect more computation in more places
Compute everywhere?
● More computation doesn’t just mean computers in new places (à la IoT),
it means CPUs in places we once thought of as mere components
● E.g., open 32-bit CPUs replacing hidden, closed 8-bit microcontrollers
● We are already seeing CPUs on the NIC (SmartNIC), CPUs next to flash
(e.g., open-channel SSD) and on the spindle (e.g. WD’s SweRV)
● New opportunities for hardware/software co-design: keep hardware
simple and put more sophistication into software and/or soft logic
● There are several trends acting as accelerants for this shift...
Open instruction sets
● x86 and ARM -- the two market victors -- are both encumbered by
history and licensing
● RISC-V is an attempt to learn from the ISA mistakes of the past, in a
vessel that is entirely open -- and with open implementations
● RISC-V is very promising, but there remain many gaps to close
● To succeed, RISC-V must focus as much on the SoC as the ISA -- while
remaining entirely open!
Open FPGAs
● FPGA bitstreams have historically been entirely proprietary -- and one
is therefore dependent upon proprietary tools to generate them
● The Lattice iCE40 bitstream format was reverse engineered in 2015 by
Claire Wolf, and can be entirely synthesized with an open toolchain!
● While Xilinx (AMD) and Altera (Intel) retain proprietary components
(e.g., for timing models), newcomers like QuickLogic are entirely open
● See, e.g., SymbiFlow, Verilog to Routing (VTR), Yosys, OpenFPGA, and
the (new!) Open Source FPGA Foundation
Open HDLs
● Hardware description languages have traditionally been dominated by
Verilog and (later) SystemVerilog
● Compilers have been historically proprietary -- and the languages
themselves are error prone
● In recent years we have seen a wave of new, open HDLs, e.g.: Chisel,
nMigen, Bluespec, SpinalHDL, Mamba (PyMTL 3), HardCaml
● Of these, one is particularly noteworthy...
Open HDL: Bluespec
● Bluespec is a high-level HDL that takes its inspiration from formal
specification languages -- and strongly typed languages like Haskell
● Bluespec uses the expressiveness of the language to move away from
individual signals -- and to atomic rules and interfaces
● This allows for the compiler to do the hard work of connecting modules
and proving correctness, greatly reducing verification time!
● In the words of Oxide engineer Arjen Roodselaar, “Bluespec is to
SystemVerilog what Rust is to assembly”
Open HDL: Bluespec
● Bluespec was proprietary for 20 years; open sourced in early 2020!
● We at Oxide feel that Bluespec is a profoundly transformative
technology -- but not one that is broadly understood or appreciated!
● More details:
○ https://github.com/B-Lang-org/Documentation
○ https://github.com/B-Lang-org/bsc
○ https://github.com/oxidecomputer/cobalt
Open source EDA
● Proprietary software has historically dominated EDA…
● Open source alternatives have existed for years -- but one in particular,
KiCad, has enjoyed sufficiently broad sponsorship to close the gaps with
professional-grade software
● The maturity of KiCad coupled with the rise of quick turn PCB
manufacturing/assembly has allowed for astonishing speed:
○ From conception to manufacturer in hours
○ From manufacturer to shipping board in days
Board economics
● Single board computers are very accessible!
○ An STM32 Nucleo-144 board with 400 MHz Cortex-M7 CPU + 2
MB of flash + 1 MB of RAM + all I/O peripherals for less than $30
○ A BeagleBone Black -- with 1 GHz Cortex-A8 CPU + 4 GB of flash +
512 MB DDR3 + HDMI for less than $60!
● All documentation available online and without NDA -- and the
BeagleBone Black is (nearly) entirely open
● The BeagleBone Black can also be used as a logic analyzer via sigrok
Open source firmware
● The software that runs closest to the hardware is increasingly open,
with drivers nearly (nearly!) always open
● Increasingly, we are seeing the firmware of unseen parts of the system
become open as well (see, e.g., the Open Source Firmware Conference)
● This trend is slower in the 7nm SoCs -- but it’s happening!
● However, even in putatively open architectures, there generally still
remains proprietary software in the form of boot ROMs -- and this
proprietary software remains a problem!
Embedded Rust
● Rust has proven to be a revolution for systems software: its rich type
system, algebraic data types, and ownership model allow for fast, correct code
● Slightly more surprising has been Rust’s ability to get small -- which,
coupled with its lack of a runtime, lets it fit everywhere (see the
sketch below)!
● With its safety and expressive power, Rust represents a quantum leap
over C -- and without losing performance or sacrificing size
● We at Oxide are working on a de novo Rust operating system for the
embedded use case that we will (naturally?) open source; stay tuned!
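As a minimal sketch of what “no runtime, fits everywhere” means in practice -- a freestanding no_std binary (the spinning panic handler and bare _start entry point are illustrative; real firmware would use a board’s HAL crate and startup code such as cortex-m-rt):

```rust
#![no_std]  // no standard library: only `core`, which assumes no OS
#![no_main] // no conventional entry point; we provide our own symbol

use core::panic::PanicInfo;

// Every no_std binary must define its panic behavior; here we simply spin.
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// Bare entry point; on a real board this would be the reset handler
// (e.g., the #[entry] attribute from the cortex-m-rt crate).
#[no_mangle]
pub extern "C" fn _start() -> ! {
    let mut counter: u32 = 0;
    loop {
        // Real firmware would toggle a GPIO or service peripherals here.
        counter = counter.wrapping_add(1);
    }
}
```

Built with, e.g., cargo build --target thumbv7em-none-eabihf (the bare-metal target matching the Cortex-M7 Nucleo-144 above), there is no runtime to carry along -- the binary is just the code you wrote.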
To sum...
“That changed everything”
A new Golden Age!
● Thanks to Moore’s Law, Wright’s Law and the rise of open source, it is
easier to build hardware than ever before!
● We are going to see computers in many more places, posing challenges
to us all to develop reliable, secure, high-performing systems
● Software remains essential, but we must not think of it in isolation; we
must co-design the hardware and the software in our systems!
● The systems are open, the communities are welcoming! Let’s build!
