• Open POWER ISA:
Opening POWER Instruction Set Architecture (ISA),
inclusive of patent rights.
• Open Reference Designs:
Open sourcing a softcore implementation of the
POWER ISA as well as reference designs for the
architecture-agnostic Open Coherent Accelerator
Processor Interface (OpenCAPI) and Open Memory
Interface (OMI).
• Open Governance:
OpenPOWER Foundation joining the Linux
Foundation
IBM expands open hardware
ecosystem with major
contributions to community
August 20, 2019
OpenPOWER
mechanical
electrical
firmware
protocols
interconnects
Hardware – ISA -> System
kernel
drivers
applications
compute/network/storage fabric
identity
orchestration
OpenStack
Linux
Open Compute
Completely open system stack, from the foundation of the processor
instruction set through the software stack
Enabling Innovation
• OpenPOWER = enable system level innovation
• Opening the POWER ISA = enable architecture innovation
• Open Memory Interface = enable memory innovation
• OpenCAPI = enable accelerator innovation
350+
Members
35
Countries
100
ISVs
This is What A Revolution Looks Like © 2018 OpenPOWER Foundation
Innovation at all
Layers of the
Stack
100k+ Linux Applications
Running on POWER
2500 Linux ISVs
developing on
POWER
Partners
Bring
Systems
to Market
150+ OpenPOWER Ready
Certified Products
20+ Systems Manufacturers
40+ POWER-based systems
shipping or in development
100+ Collaborative innovations
under way
Open Source Firmware
Entire firmware stack is Open
Source and available on
GitHub
© 2019 IBM Corporation
POWER Architecture
POWER / PowerPC
Server & Desktop
PowerPC
Embedded
POWER
Convergence enabled by:
• Little Endian (POWER8 2014)
• Linux-friendly RADIX page table
management (POWER9 2017)
User code compatible
• Mem Mgmt for low capacity
• Avoid some complex instructions
• >20 licensees, >20 operating sys
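The little-endian bullet above refers to byte order: the same 32-bit value is laid out in memory with its least significant byte first (little endian) or its most significant byte first (big endian). A minimal sketch using Python's standard `struct` module shows the difference; the value `0x0A0B0C0D` and the helper `hex_bytes` are illustrative choices, not taken from the slides.

```python
import struct

# Illustrative 32-bit value (an assumption for this sketch).
value = 0x0A0B0C0D

# "<I" packs an unsigned 32-bit integer little-endian, ">I" big-endian.
little = struct.pack("<I", value)
big = struct.pack(">I", value)


def hex_bytes(data):
    """Render a bytes object as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print("little-endian:", hex_bytes(little))  # 0d 0c 0b 0a
print("big-endian:   ", hex_bytes(big))     # 0a 0b 0c 0d
```

Code that assumes one layout when reading raw bytes sees a different value on a machine using the other layout, which is why a shared byte order makes it easier to move existing Linux software across platforms.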
Open POWER ISA
Unencumbered Open Innovation
• Rights to create, distribute, license and sell POWER ISA Cores
• Implementations of the POWER ISA represented by software, hardware description language (HDL), or integrated circuit
design
Unified Software Ecosystem
• Prevent fracturing of POWER’s mature and extensive software ecosystem
• Compliancy required to make physical implementations of POWER ISA Cores
• Patent rights granted to make POWER ISA Compliant Chips
Freedom of Choice
• Four subsets of the architecture to choose from for base compliancy
• Optionally implement additional compliant features
• Architectural resources defined within ISA for custom extensions
Collaborative Innovation of the POWER ISA
• OpenPOWER Foundation Workgroup with open governance
• Allow contributions from both OpenPOWER Foundation members and non-members
• Unanimous vote for changes that would break backward compatibility
• Majority vote for compatible changes
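The two voting rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the slides do not say how abstentions are counted or what kind of majority applies, so the sketch assumes a simple majority (more than half) of the votes cast, and the function name `change_is_approved` is hypothetical.

```python
def change_is_approved(yes_votes, total_votes, breaks_backward_compat):
    """Apply the voting rule from the slide to one proposed ISA change.

    Assumptions (not stated in the source): only cast votes are counted,
    and "majority" means strictly more than half of them.
    """
    if total_votes <= 0:
        raise ValueError("at least one vote must be cast")
    if not 0 <= yes_votes <= total_votes:
        raise ValueError("yes_votes must be between 0 and total_votes")

    if breaks_backward_compat:
        # Compatibility-breaking changes need a unanimous vote.
        return yes_votes == total_votes
    # Compatible changes need a majority vote.
    return 2 * yes_votes > total_votes


# A compatible change with 6 of 10 votes passes; a breaking one does not.
print(change_is_approved(6, 10, breaks_backward_compat=False))
print(change_is_approved(6, 10, breaks_backward_compat=True))
```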
Microwatt: Make your own POWER CPU, Today at 11:20am
• Two month part time effort
• Tiny Power Core
• Open Power ISA scalar fixed point subset
• Single issue, in order
• Reuse from Open Hardware world – UART
• Written in VHDL 2008
• ghdl Open Source simulation tools
• Xilinx Vivado for FPGA synthesis
• Micropython
• Small embedded Python interpreter
• No modifications to generic code
• Platform specific code (startup, console etc).
• https://github.com/antonblanchard/microwatt
• Open Source community extensions
• Zephyr IoT OS ported
• Upgrading to Linux capable
Scale Out
Direct Attach Memory
Open Memory Interface
Scale Up
Buffered Memory
10/31/19
Low latency access
Commodity packaging form factor
Superior RAS, High bandwidth, High Capacity
Agnostic interface for alternate memory innovations
Same Open Memory Interface used for all Systems and Memory Technologies
Open Memory Interface (OMI)
Near Tier
Extreme Bandwidth
Low Capacity
Commodity
Low Latency
Low Cost
Enterprise
RAS
Capacity
Bandwidth
Storage Class
Extreme Capacity
Persistence
DRAM DIMM Comparison
• Technology agnostic
• Low cost
• Ultra-scale system density
• Enterprise reliability
• Low-latency
• High bandwidth
Approximate Scale
Economy
Ultra-scale
JEDEC DDR DIMM
IBM Centaur DIMM
OMI DDIMM
• Designed to support range of devices
• Coherent Caching Accelerators
• Network Controllers
• Differentiated Memory
• High Bandwidth
• Low Latency
• Storage Class Memory
• Storage Controllers
• Asymmetric design, endpoint optimized for host and device attach
• ISA of Host Architecture: Need to hide difference in Coherence, Memory Model, Address Translation, etc.
• Design schedule: The design schedule of a high performance CPU host is typically on the order of multiple years;
accelerator devices have much shorter development cycles, typically less than a year.
• Timing Corner: ASIC and FPGA technologies run at lower frequencies and with less timing optimization than CPUs.
• Plurality of devices: Effort in the host, both IP and circuit resource, has a multiplicative effect.
• Trust: Attached devices are susceptible to both intentional and unintentional trust violations
• Cache coherence: Hosts have high variability in protocol. Host cannot trust attached device to obey rules.
OpenCAPI Design Goals
New Open Source IP
CAPI Flash – Accelerated NVMe Controller FPGA IP
The CAPI Flash accelerated NVMe controller IP is designed for the Bittware 250SP FPGA NVMe adapter to work
with the CAPI 2.0 protocol. The purpose of the IP is to reduce CPU kernel overhead for storage transactions.
This allows 6x the number of 4k random read IOPs per core vs standard NVMe. It is used in conjunction with
the CXL Flash Adapter Driver kernel module in Linux.
Features:
• IP includes an NVMe controller and supports from 1 – 4 attached NVMe devices
• IP interfaces directly with the CAPI 2.0 PSL9 FPGA IP available at
https://www-355.ibm.com/systems/power/openpower/welcome.xhtml
• Supports two modes of operation
• Legacy mode – any flash device (LUN) accessed as a regular disk drive (e.g., /dev/sdc)
• block mode – direct user space access via a special block library
• http://github.com/open-power/capiflash
• supports both raw single LUN access or multiple virtual LUNs
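In legacy mode the flash LUN appears as an ordinary disk, so standard file I/O applies. The following is a minimal sketch of reading one 4 KiB block by index, assuming a Unix-like system (`os.pread` is not available on Windows). The function name `read_block` is hypothetical, and the sketch does not use the capiflash block library, whose API is not described in the slides.

```python
import os

BLOCK_SIZE = 4096  # 4 KiB, matching the 4k I/O size quoted on the slide


def read_block(path, block_index, block_size=BLOCK_SIZE):
    """Read one block from a disk device or regular file by block index.

    `path` could be a device node such as /dev/sdc (the slide's example),
    which normally requires elevated privileges to open.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        # pread reads at an absolute offset without moving the file position.
        return os.pread(fd, block_size, block_index * block_size)
    finally:
        os.close(fd)
```

For example, `read_block("/dev/sdc", 10)` would return up to 4096 bytes starting at byte offset 40960. Block mode, by contrast, bypasses this kernel path and goes through the user-space block library linked above.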
Performance (NVMe vs CAPI FLASH using Samsung PM1725a U.2 drive):
• 6x 4k random read IOPs per core
• 2.5x 4k random write IOPs per core
• 7x 4k random read IOPs per LUN (1 job, queue depth 128)
• 2.5x 4k random write IOPs per LUN (1-20 jobs, queue depth 32)
• 50% lower latency 4k random IOPs (1-20 jobs, queue depth > 32)
https://www.kernel.org/doc/Documentation/powerpc/cxlflash.txt
FlashGT
Block-API
SCSI layer
Cxlflash dd
CapiFLASH
Block dd
CAPI Flash Stack
User space
Kernel
Cxl
OpenCAPI Acceleration Framework – OC-Accel
The Integrated Development Environment (IDE) for creating FPGA-based application accelerators is
available on GitHub now!
https://github.com/OpenCAPI/oc-accel
OpenCAPI Acceleration Framework: Unleash the Power of Customized Accelerators
Today at 14:40 – Come and hear from Lu Yong and Alexandre Castellane!
OpenCAPI Acceleration Framework, abbreviated as OC-Accel, is a platform that enables programmers and
computer engineers to quickly create FPGA-based accelerators. The software part and the hardware part
of an acceleration action share data in server host memory through the OpenCAPI interface. With it, people can
quickly design an accelerator and benefit from the bandwidth, latency, coherency and programmability
advantages of OpenCAPI.
More to Come
Will continue to seed the community and build value through regular open source contributions
Collaboration within the open community in order to develop an optimal IP roadmap
Large IP portfolio to leverage – both in Design IP and Commercial Grade Design Tools

0 foundation update__final - Mendy Furmanek
