This slide deck was presented by Mendy Furmanek at OpenPOWER Summit EU 2019. The original is available at:
https://static.sched.com/hosted_files/opeu19/9c/Final%20-%20Mendy%20F..pdf
0 foundation update__final - Mendy Furmanek
1.
2. • Open POWER ISA:
Opening POWER Instruction Set Architecture (ISA),
inclusive of patent rights.
• Open Reference Designs:
Open sourcing a softcore implementation of the
POWER ISA as well as reference designs for the
architecture-agnostic Open Coherent Accelerator
Processor Interface (OpenCAPI) and Open Memory
Interface (OMI).
• Open Governance:
OpenPOWER Foundation joining the Linux
Foundation
IBM expands open hardware
ecosystem with major
contributions to community
August 20, 2019
7. POWER Architecture
POWER / PowerPC
Server & Desktop
PowerPC
Embedded
POWER
Convergence enabled by:
• Little Endian (POWER8 2014)
• Linux-friendly RADIX page table
management (POWER9 2017)
User code compatible
• Mem Mgmt for low capacity
• Avoid some complex instructions
• >20 licensees, >20 operating sys
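As a small illustration of the little-endian convergence point above, the sketch below uses Python's standard `struct` module to show how one 32-bit value is laid out in each byte order, and what happens when the bytes are read with the wrong order. The sample value is arbitrary and does not come from the slides.

```python
import struct

value = 0x0A0B0C0D  # arbitrary sample value, chosen only for illustration

big = struct.pack(">I", value)     # big-endian: most significant byte first
little = struct.pack("<I", value)  # little-endian: least significant byte first

print(big.hex())     # 0a0b0c0d
print(little.hex())  # 0d0c0b0a

# Interpreting little-endian bytes as big-endian yields a different number,
# which is why data layouts must agree on byte order.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0xd0c0b0a
```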
8. Open POWER ISA
Unencumbered Open Innovation
• Rights to create, distribute, license and sell POWER ISA Cores
• Implementations of the POWER ISA represented by software, hardware description language (HDL), or integrated circuit
design
Unified Software Ecosystem
• Prevent fracturing of POWER’s mature and extensive software ecosystem
• Compliancy required to make physical implementations of POWER ISA Cores
• Patent rights granted to make POWER ISA Compliant Chips
Freedom of Choice
• Four subsets of the architecture to choose from for base compliancy
• Optionally implement additional compliant features
• Architectural resources defined within ISA for custom extensions
Collaborative Innovation of the POWER ISA
• OpenPOWER Foundation Workgroup with open governance
• Allow contributions from both OpenPOWER Foundation members and non-members
• Unanimous vote for changes that would break backward compatibility
• Majority vote for compatible changes
10. Scale Out
Direct Attach Memory
Open Memory Interface
Scale Up
Buffered Memory
10/31/19
Low latency access
Commodity packaging form factor
Superior RAS, High bandwidth, High Capacity
Agnostic interface for alternate memory innovations
Same Open Memory Interface used for all Systems and Memory Technologies
Open Memory Interface (OMI)
• Near Tier: Extreme Bandwidth, Low Capacity
• Commodity: Low Latency, Low Cost
• Enterprise: RAS, Capacity, Bandwidth
• Storage Class: Extreme Capacity, Persistence
11. DRAM DIMM Comparison
• Technology agnostic
• Low cost
• Ultra-scale system density
• Enterprise reliability
• Low-latency
• High bandwidth
Approximate Scale
Economy
Ultra-scale
JEDEC DDR DIMM
IBM Centaur DIMM
OMI DDIMM
12. • Designed to support a range of devices
• Coherent Caching Accelerators
• Network Controllers
• Differentiated Memory
• High Bandwidth
• Low Latency
• Storage Class Memory
• Storage Controllers
• Asymmetric design, endpoint optimized for host and device attach
• ISA of Host Architecture: Need to hide differences in Coherence, Memory Model, Address Translation, etc.
• Design schedule: The design schedule of a high performance CPU host is typically on the order of multiple years.
Accelerator devices, by contrast, have much shorter development cycles, typically less than a year.
• Timing Corner: ASIC and FPGA technologies run at lower frequencies and with less timing optimization than CPUs.
• Plurality of devices: Effort in the host, both IP and circuit resource, has a multiplicative effect.
• Trust: Attached devices are susceptible to both intentional and unintentional trust violations.
• Cache coherence: Hosts have high variability in protocol. The host cannot trust an attached device to obey the rules.
OpenCAPI Design Goals
14. CAPI Flash – Accelerated NVMe Controller FPGA IP
The CAPI Flash accelerated NVMe controller IP is designed for the Bittware 250SP FPGA NVMe adapter to work
with the CAPI 2.0 protocol. The purpose of the IP is to reduce CPU kernel overhead for storage transactions.
This allows 6x the number of 4k random read IOPs per core vs standard NVMe. It is used in conjunction with
the CXL Flash Adapter Driver kernel module in Linux.
Features:
• IP includes an NVMe controller and supports 1 to 4 attached NVMe devices
• IP interfaces directly with the CAPI 2.0 PSL9 FPGA IP available at
https://www-355.ibm.com/systems/power/openpower/welcome.xhtml
• Supports two modes of operation
• Legacy mode – any flash device (LUN) accessed as a regular disk drive (e.g., /dev/sdc)
• Block mode – direct user space access via a special block library
• http://github.com/open-power/capiflash
• supports either raw single-LUN access or multiple virtual LUNs
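Because legacy mode exposes the LUN as a regular disk, ordinary POSIX reads apply to it. The sketch below shows what a 4k random-read access pattern looks like using only Python's standard library. It is an illustration under assumptions, not the capiflash block library API and not a benchmark: the function name `random_4k_reads` is hypothetical, and the reads go through the normal kernel I/O path (including the page cache).

```python
import os
import random

BLOCK_SIZE = 4096  # 4k transfer size, matching the access size quoted on the slide


def random_4k_reads(path, num_reads, seed=0):
    """Read num_reads blocks of 4 KiB from random block-aligned offsets.

    Hypothetical helper for illustration; returns the total bytes read.
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        # Seeking to the end reports the size of a regular file or block device.
        size = os.lseek(fd, 0, os.SEEK_END)
        num_blocks = size // BLOCK_SIZE
        if num_blocks == 0:
            raise ValueError("target is smaller than one 4 KiB block")
        total = 0
        for _ in range(num_reads):
            offset = rng.randrange(num_blocks) * BLOCK_SIZE
            total += len(os.pread(fd, BLOCK_SIZE, offset))
        return total
    finally:
        os.close(fd)
```

On a system where the device appears as in the slide's example, a call such as `random_4k_reads("/dev/sdc", 1000)` would exercise the same pattern, provided the caller has read permission on the device node.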
Performance (NVMe vs CAPI FLASH using Samsung PM1725a U.2 drive):
• 6x 4k random read IOPs per core
• 2.5x 4k random write IOPs per core
• 7x 4k random read IOPs per LUN (1 job, queue depth 128)
• 2.5x 4k random write IOPs per LUN (1-20 jobs, queue depth 32)
• 50% lower latency for 4k random I/O (1-20 jobs, queue depth > 32)
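To make the per-core figures concrete, the arithmetic below estimates how many cores a fixed read workload would need with and without the 6x per-core improvement. Only the 6x factor comes from the slide; the baseline IOPS per core and the target workload are hypothetical numbers chosen for the example.

```python
import math


def cores_needed(target_iops, iops_per_core):
    """Smallest whole number of cores that can sustain target_iops."""
    return math.ceil(target_iops / iops_per_core)


READ_SPEEDUP = 6                   # from the slide: 6x 4k random read IOPs per core
BASELINE_IOPS_PER_CORE = 100_000   # hypothetical baseline for standard NVMe
TARGET_IOPS = 1_200_000            # hypothetical workload

nvme_cores = cores_needed(TARGET_IOPS, BASELINE_IOPS_PER_CORE)
capi_cores = cores_needed(TARGET_IOPS, BASELINE_IOPS_PER_CORE * READ_SPEEDUP)

print(nvme_cores)  # 12
print(capi_cores)  # 2
```

Under these assumed inputs, the same read workload occupies one sixth of the cores, which is the practical meaning of a per-core IOPS multiplier.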
https://www.kernel.org/doc/Documentation/powerpc/cxlflash.txt
[Figure: CAPI Flash Stack. Layers: User space, Kernel. Labels: FlashGT, Block-API, SCSI layer, Cxlflash dd, CapiFLASH, Block dd, Cxl]
15. OpenCAPI Acceleration Framework – OC-Accel
The Integrated Development Environment (IDE) for creating FPGA-based application accelerators is
available on GitHub now!
https://github.com/OpenCAPI/oc-accel
OpenCAPI Acceleration Framework: Unleash the Power of Customized Accelerators
Today at 14:40 – Come and hear from Lu Yong and Alexandre Castellane!
OpenCAPI Acceleration Framework, abbreviated as OC-Accel, is a platform that enables programmers and
computer engineers to quickly create FPGA-based accelerators. The software and hardware parts of an
acceleration action share data in server host memory through the OpenCAPI interface. With it, people can
quickly design an accelerator and benefit from the bandwidth, latency, coherency and programmability
advantages of OpenCAPI.
16. More to Come
Will continue to seed the community and build value through regular open source contributions
Collaboration within the open community in order to develop an optimal IP roadmap
Large IP portfolio to leverage – both in Design IP and Commercial Grade Design Tools