Compute Express Link (CXL)
Agenda
1. What is CXL?
2. Why do we need CXL?
3. CXL vs PCIe
4. Where CXL fits
5. CXL protocols
6. CXL device types
What is CXL?
• CXL is an open, industry-supported, cache-coherent interconnect
standard for processors, memory expansion, and
accelerators.
• Essentially, CXL technology maintains memory coherency
between the CPU memory space and memory on attached
devices.
Why do we need CXL?
1. It aims to provide a high-speed, low-latency connection between CPUs, GPUs, FPGAs,
and other accelerators while enabling coherent memory access between these devices.
2. CXL can work alongside PCIe, extending its capabilities and addressing the needs of
emerging workloads such as artificial intelligence, machine learning, and
high-performance computing.
CXL vs PCIe
Purpose and Design:
PCIe (5.0): PCIe is a widely used interconnect standard designed primarily
for connecting components within a computer system, such as
graphics cards, storage devices, and network cards.
CXL: CXL is designed with a focus on memory coherency and acceleration. It aims
to provide a high-speed, low-latency connection between CPUs, GPUs, FPGAs,
and other accelerators while enabling coherent memory access between these
devices.
CXL vs PCIe (continued)
Memory Coherency:
PCIe: While PCIe supports peer-to-peer data transfers, it does not inherently
provide memory coherency between devices. In other words, when data is shared
between devices over PCIe, keeping it consistent across different caches and
memory spaces can be complex and may require additional software overhead.
CXL: CXL, however, is designed to support memory coherency, which means that
devices connected via CXL can directly access each other's memory and maintain data
consistency with less software intervention.
CXL vs PCIe (continued)
Workload Acceleration:
• PCIe: While PCIe is excellent for connecting a wide variety of devices, its lack of
memory coherency support can add significant overhead in scenarios
where data must be synchronized frequently between devices.
• CXL: CXL's focus on memory coherency and acceleration makes it well suited
to tasks that require intensive data sharing and parallel processing, such as
artificial intelligence and scientific computing.
Where CXL fits
• CXL builds upon the physical and electrical interfaces of PCIe with
protocols that establish coherency, simplify the software stack, and maintain
compatibility with existing standards.
• CXL leverages a PCIe 5.0 feature that allows alternate protocols to use the
physical PCIe layer.
• CXL transaction protocols are activated only if both sides of the link support CXL.
Otherwise, the link partners operate as PCIe devices.
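The fallback rule in the last bullet can be sketched as a small decision function. This is only an illustration of the rule as stated on the slide, not the actual link-training procedure; the `Port` class, its `supports_cxl` flag, and `negotiate_link_mode` are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one side of a link."""
    name: str
    supports_cxl: bool  # assumed capability flag, not a real register


def negotiate_link_mode(upstream: Port, downstream: Port) -> str:
    """Return "CXL" only when both link partners support CXL, else "PCIe"."""
    if upstream.supports_cxl and downstream.supports_cxl:
        return "CXL"
    return "PCIe"


host = Port("host root port", supports_cxl=True)
expander = Port("memory expander", supports_cxl=True)
legacy_nic = Port("legacy NIC", supports_cxl=False)

print(negotiate_link_mode(host, expander))    # both sides support CXL
print(negotiate_link_mode(host, legacy_nic))  # falls back to PCIe
```

The point of the sketch is that CXL support is a property of the pair: a CXL-capable host still runs an ordinary PCIe link when the attached device does not support CXL.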
Terminology / Acronyms
• Accelerator: A device that software running on host processors can use to
offload or perform any type of compute or I/O task.
Examples of accelerators include programmable agents (such as GPUs/GPGPUs),
fixed-function agents, and reconfigurable agents such as FPGAs.
• Cache coherence: In a multiprocessor system, copies of the same data can become
inconsistent between adjacent levels or within the same level of the memory hierarchy.
Cache coherence is the property that keeps these copies consistent.
Conceptual Diagram of Accelerator Attached to Processor via CXL
Cache Coherence Protocols
• MSI protocol (Modified, Shared, Invalid)
• MOSI protocol (Modified, Owned, Shared, Invalid)
• MESI protocol (Modified, Exclusive, Shared, Invalid)
• MOESI protocol (Modified, Owned, Exclusive, Shared, Invalid)
MESI protocol
• The MESI protocol is an invalidate-based cache coherence protocol and one of the most
common protocols that support write-back caches. It is also known as the Illinois protocol.
• The letters in the acronym MESI represent four mutually exclusive states that a cache line
can be marked with (encoded using two additional bits):
• States:
• Modified (M): The cache line is present only in the current cache and is dirty, i.e., the
data stored in the cache and in main memory are different. The data in the cache has
been modified, and the changes need to be reflected in main memory.
• Exclusive (E): The cache line is present only in the current cache and is clean, i.e., the
cache and main memory hold identical data.
MESI protocol (continued)
• Shared (S): The cache line holds the most current copy of the data, matches main memory,
and may also be held by other caches.
• Invalid (I): The cache line does not hold valid data; the block must be fetched from
another cache or main memory.
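As a rough illustration of the four states and their two-bit encoding, the sketch below models them as an enumeration. The particular bit values and the helper names (`is_dirty`, `may_be_in_other_caches`, `can_write_silently`) are assumptions made for this example; real hardware is free to choose any encoding.

```python
from enum import IntEnum


class MesiState(IntEnum):
    # Two bits are enough for four states; these bit values are arbitrary.
    INVALID = 0b00
    SHARED = 0b01
    EXCLUSIVE = 0b10
    MODIFIED = 0b11


def is_dirty(state: MesiState) -> bool:
    """Only a Modified line differs from main memory."""
    return state is MesiState.MODIFIED


def may_be_in_other_caches(state: MesiState) -> bool:
    """Only a Shared line may also be held by another cache."""
    return state is MesiState.SHARED


def can_write_silently(state: MesiState) -> bool:
    """Exclusive and Modified lines can be written without a bus request."""
    return state in (MesiState.EXCLUSIVE, MesiState.MODIFIED)


for s in MesiState:
    print(f"{s.name:<9} bits={s.value:02b} dirty={is_dirty(s)} "
          f"shared={may_be_in_other_caches(s)} silent_write={can_write_silently(s)}")
```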
• Operation:-
MESI protocol (continued)
• The MESI protocol is defined by a finite-state machine that transitions from one state to
another based on two stimuli.
• The first stimulus is a processor-specific read or write request. For example: a
processor P1 has a block X in its cache, and the processor issues a request to read
from or write to that block.
• The second stimulus arrives over the bus connecting the processors. These
"bus-side requests" come from other processors that do not have the cache block or the
updated data in their caches.
• Types of processor requests and bus-side requests:
• Processor requests to the cache include the following operations:
1. PrRd: The processor requests to read a cache block.
2. PrWr: The processor requests to write a cache block.
MESI protocol (continued)
• Bus-side requests are the following:
• BusRd: Snooped request indicating that another processor has requested to read a cache
block.
• BusRdX: Snooped request indicating that another processor that does not already have the
block has requested to write to it.
• BusUpgr: Snooped request indicating that another processor that already holds the cache
block in its own cache has requested to write to it.
• Flush: Snooped request indicating that another processor is writing an entire cache block
back to main memory.
• FlushOpt: Snooped request indicating that an entire cache block is posted on the bus in order to
supply it to another processor (cache-to-cache transfer).
• Snooping operation: In a snooping system, all caches on a bus monitor all the
transactions on that bus. Every cache has a copy of the sharing status of every block of
physical memory it has stored. The state of a block changes according to the state
diagram of the protocol in use. (Refer to the MESI state diagram above.) The bus has
snoopers on both sides:
1. A snooper on the processor/cache side.
2. On the memory side, the snooping function is performed by the memory controller.
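The two stimuli can be sketched as two transition functions over a single cache block, following the textbook MESI (Illinois) state diagram. This is a simplified model, not a hardware description: the function names, the string encoding of states, and the `access` helper that plays the role of the bus are all assumptions made for the example.

```python
def processor_request(state, request, others_have_copy=False):
    """Local PrRd/PrWr: return (next_state, bus_request_issued_or_None)."""
    if request == "PrRd":
        if state == "I":
            # Read miss: fetch the block; Exclusive only if nobody else has it.
            return ("S" if others_have_copy else "E"), "BusRd"
        return state, None  # read hit in S, E, or M
    if request == "PrWr":
        if state == "I":
            return "M", "BusRdX"   # write miss
        if state == "S":
            return "M", "BusUpgr"  # already has the data, needs ownership
        return "M", None           # E or M: silent write
    raise ValueError(f"unknown processor request: {request}")


def snooped_request(state, request):
    """Bus-side request seen by another cache: return (next_state, response)."""
    if state == "I":
        return "I", None  # nothing cached, nothing to do
    if request == "BusRd":
        return "S", ("Flush" if state == "M" else "FlushOpt")
    if request == "BusRdX":
        return "I", ("Flush" if state == "M" else "FlushOpt")
    if request == "BusUpgr":
        if state != "S":
            raise ValueError("BusUpgr is only expected while holding the block in S")
        return "I", None
    raise ValueError(f"unknown bus request: {request}")


def access(caches, who, request):
    """Apply one processor request and let every other cache snoop the bus."""
    others = [name for name in caches if name != who]
    shared = any(caches[name] != "I" for name in others)
    caches[who], bus = processor_request(caches[who], request, shared)
    if bus is not None:
        for name in others:
            caches[name], _ = snooped_request(caches[name], bus)
    return bus


caches = {"P1": "I", "P2": "I"}
for who, req in [("P1", "PrRd"), ("P2", "PrRd"), ("P1", "PrWr"), ("P2", "PrRd")]:
    bus = access(caches, who, req)
    print(f"{who} {req:<5} bus={bus} -> {caches}")
```

In this trace P1 first obtains the block in E, drops to S when P2 reads it, upgrades to M on its write (invalidating P2's copy), and returns to S after flushing the block when P2 reads again.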
CXL protocols
• CXL.io: This protocol is functionally equivalent to the PCIe protocol.
As the foundational communication protocol, CXL.io is used for device
discovery, enumeration, and link-up.
• CXL.cache: This protocol, which is designed for more specific
applications, enables accelerators to efficiently access and cache host
memory for optimized performance.
• CXL.mem: This protocol enables a host, such as a processor, to
access device-attached memory using load/store commands.
Together, these three protocols facilitate the coherent sharing of memory
resources between computing devices.
CXL Device Types
• Type 1 Devices: CXL.io + CXL.cache
• Type 2 Devices: CXL.io + CXL.cache + CXL.mem
• Type 3 Devices: CXL.io + CXL.mem
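The mapping from supported protocols to device type can be written as a small lookup. This is a minimal sketch of the table above; the `classify` function and the `DEVICE_TYPES` dictionary are hypothetical names, and CXL.mem is the protocol the slide also calls CXL.memory.

```python
# Protocol combinations taken from the device-type list above.
DEVICE_TYPES = {
    frozenset({"CXL.io", "CXL.cache"}): "Type 1",
    frozenset({"CXL.io", "CXL.cache", "CXL.mem"}): "Type 2",
    frozenset({"CXL.io", "CXL.mem"}): "Type 3",
}


def classify(protocols):
    """Return the CXL device type for a set of supported protocols."""
    return DEVICE_TYPES.get(frozenset(protocols), "not a defined CXL device type")


print(classify(["CXL.io", "CXL.cache"]))             # smart NIC style device
print(classify(["CXL.io", "CXL.cache", "CXL.mem"]))  # accelerator with memory
print(classify(["CXL.io", "CXL.mem"]))               # memory expander
print(classify(["CXL.io"]))                          # CXL.io alone is not a type
```

Note that CXL.io appears in every combination: it is the common base used for discovery and enumeration, and the other two protocols are what distinguish the types.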
Type 1 - Device with Cache
• Type 1 Devices: Accelerators such as smart NICs
typically lack local memory. Via CXL, these devices can
communicate with the host processor's DDR memory.
• Type 1 CXL devices have special needs for which a fully
coherent cache in the device is valuable.
• The size of the cache that can be supported for such devices
depends on the host's snoop filtering capacity.
Type 2 Device
• Type 2 Devices: GPUs, ASICs, and FPGAs are all
equipped with DDR or HBM memory and can use CXL
to make the host processor's memory locally available
to the accelerator, and the accelerator's memory
locally available to the CPU.
• The key goal for CXL is to provide a means for the host to
push operands into device-attached memory and for the
host to pull results out of device-attached memory.
• The bias-based coherency model defines two
states of bias for device-attached memory:
1. Host bias
2. Device bias
Type 2 Device (continued)
• Host bias: When the device-attached memory is in the Host Bias state, it appears to
the device just as regular host-attached memory does. That is, if the device needs
to access it, it must send a request to the host, which resolves coherency for
the requested line.
• Device bias: When the device-attached memory is in the Device Bias state, the device
is guaranteed that the host does not have the line in any cache. As such, the device
can access it without sending any transaction (requests, snoops, etc.) to the host
whatsoever.
Note: The host itself sees a uniform view of device-attached memory regardless of the bias state. In
both modes, coherency is preserved for device-attached memory.
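The difference between the two bias states, seen from the device side, can be sketched as the list of steps a device access goes through. This is a conceptual illustration only: the `Bias` enum, the `device_access_path` function, and the step strings are invented for the example and do not correspond to actual CXL messages.

```python
from enum import Enum


class Bias(Enum):
    HOST = "host"
    DEVICE = "device"


def device_access_path(bias: Bias) -> list:
    """Return the (illustrative) steps a device takes to access its own memory."""
    if bias is Bias.HOST:
        # Host bias: the line is treated like host-attached memory,
        # so the host must resolve coherency first.
        return [
            "device -> host: request the line",
            "host: resolve coherency for the line",
            "device: access device-attached memory",
        ]
    # Device bias: the host is guaranteed not to cache the line,
    # so no transaction to the host is needed.
    return ["device: access device-attached memory"]


for bias in Bias:
    print(f"{bias.value} bias: {len(device_access_path(bias))} step(s)")
```

The extra round trip through the host under host bias is the cost that device bias avoids, which is why a line the accelerator is actively working on benefits from being in device bias.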
Type 3 Device
• A Type 3 CXL device supports the CXL.io and CXL.mem
protocols. An example of a Type 3 CXL device is a memory
expander for the host, as shown in the figure below.
• Type 3 Devices: Memory devices can be attached via
CXL to provide additional bandwidth and capacity to
host processors. The type of memory is independent of
the host's main memory.
• The device operates primarily over CXL.mem to service
requests sent from the host. The CXL.io protocol is primarily
used for device discovery, enumeration, error reporting,
and management.
• The device is also permitted to use the CXL.io protocol for
other I/O-specific application usages.
Flex Bus
• A Flex Bus port allows designs to choose between providing native
PCIe protocol or CXL over a high-bandwidth, off-package link; the
selection happens during link training via alternate protocol
negotiation.
Flex Bus Link Features
• Native PCIe mode, with full feature support as defined in the PCIe
specification.
• Signaling rate of 32 GT/s, with degraded rates of 16 GT/s or 8 GT/s, in CXL
mode.
• Link width support for x16, x8, x4, x2 (degraded mode), and x1
(degraded mode) in CXL mode.
• Bifurcation (aka Link Subdivision) support to x4 in CXL mode.
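To get a feel for what these rates and widths mean, the raw per-direction bit rate of a link can be estimated as signaling rate times lane count. The sketch below assumes one bit per lane per transfer and deliberately ignores line encoding and protocol overhead, so the figures are upper bounds rather than achievable throughput; the function name is a hypothetical helper.

```python
def raw_bandwidth_gbytes_per_s(rate_gt_per_s: float, lanes: int) -> float:
    """Raw per-direction bandwidth in GB/s, ignoring encoding and protocol overhead."""
    # One transfer moves one bit per lane; divide by 8 to convert bits to bytes.
    return rate_gt_per_s * lanes / 8


for rate in (32, 16, 8):            # full rate and the two degraded rates
    for lanes in (16, 8, 4, 2, 1):  # supported link widths
        gbps = raw_bandwidth_gbytes_per_s(rate, lanes)
        print(f"{rate:>2} GT/s x{lanes:<2} -> {gbps:5.1f} GB/s raw per direction")
```

For example, a x16 link at 32 GT/s has a raw rate of 64 GB/s in each direction, while a bifurcated x4 link at the same rate has 16 GB/s.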
Flex Bus Layer overview

More Related Content

What's hot

03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_FinalGopi Krishnamurthy
 
MemVerge: The Software Stack for CXL Environments
MemVerge: The Software Stack for CXL EnvironmentsMemVerge: The Software Stack for CXL Environments
MemVerge: The Software Stack for CXL EnvironmentsMemory Fabric Forum
 
Shared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIShared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIAllan Cantle
 
Enfabrica - Bridging the Network and Memory Worlds
Enfabrica - Bridging the Network and Memory WorldsEnfabrica - Bridging the Network and Memory Worlds
Enfabrica - Bridging the Network and Memory WorldsMemory Fabric Forum
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMemory Fabric Forum
 
Verification Strategy for PCI-Express
Verification Strategy for PCI-ExpressVerification Strategy for PCI-Express
Verification Strategy for PCI-ExpressDVClub
 
Reset Metastability Issues.pptx
Reset Metastability Issues.pptxReset Metastability Issues.pptx
Reset Metastability Issues.pptxssuserfb39fe
 
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXL
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXLQ1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXL
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXLMemory Fabric Forum
 
Slideshare - PCIe
Slideshare - PCIeSlideshare - PCIe
Slideshare - PCIeJin Wu
 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdfFrangoCamila
 
Double data rate (ddr)
Double data rate (ddr)Double data rate (ddr)
Double data rate (ddr)Anderson Huang
 

What's hot (20)

Pcie basic
Pcie basicPcie basic
Pcie basic
 
Past Present and Future of CXL
Past Present and Future of CXLPast Present and Future of CXL
Past Present and Future of CXL
 
Coding style for good synthesis
Coding style for good synthesisCoding style for good synthesis
Coding style for good synthesis
 
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
03_03_Implementing_PCIe_ATS_in_ARM-based_SoCs_Final
 
MemVerge: The Software Stack for CXL Environments
MemVerge: The Software Stack for CXL EnvironmentsMemVerge: The Software Stack for CXL Environments
MemVerge: The Software Stack for CXL Environments
 
Ambha axi
Ambha axiAmbha axi
Ambha axi
 
Shared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMIShared Memory Centric Computing with CXL & OMI
Shared Memory Centric Computing with CXL & OMI
 
Enfabrica - Bridging the Network and Memory Worlds
Enfabrica - Bridging the Network and Memory WorldsEnfabrica - Bridging the Network and Memory Worlds
Enfabrica - Bridging the Network and Memory Worlds
 
Microchip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling EcosystemMicrochip: CXL Use Cases and Enabling Ecosystem
Microchip: CXL Use Cases and Enabling Ecosystem
 
Verification Strategy for PCI-Express
Verification Strategy for PCI-ExpressVerification Strategy for PCI-Express
Verification Strategy for PCI-Express
 
DDR
DDRDDR
DDR
 
Reset Metastability Issues.pptx
Reset Metastability Issues.pptxReset Metastability Issues.pptx
Reset Metastability Issues.pptx
 
DDR SDRAMs
DDR SDRAMsDDR SDRAMs
DDR SDRAMs
 
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXL
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXLQ1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXL
Q1 Memory Fabric Forum: Memory Processor Interface 2023, Focus on CXL
 
DFI_Blog
DFI_BlogDFI_Blog
DFI_Blog
 
Slideshare - PCIe
Slideshare - PCIeSlideshare - PCIe
Slideshare - PCIe
 
AMBA 2.0 PPT
AMBA 2.0 PPTAMBA 2.0 PPT
AMBA 2.0 PPT
 
design-compiler.pdf
design-compiler.pdfdesign-compiler.pdf
design-compiler.pdf
 
PCIe
PCIePCIe
PCIe
 
Double data rate (ddr)
Double data rate (ddr)Double data rate (ddr)
Double data rate (ddr)
 

Similar to CXL chapter1 and chapter 2 presentation.pptx

Cache performance-x86-2009
Cache performance-x86-2009Cache performance-x86-2009
Cache performance-x86-2009Léia de Sousa
 
Computer System Architecture
Computer System ArchitectureComputer System Architecture
Computer System ArchitectureBrenda Debra
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING Zena Abo-Altaheen
 
Parallel Computing - Lec 3
Parallel Computing - Lec 3Parallel Computing - Lec 3
Parallel Computing - Lec 3Shah Zaib
 
WN Memory Tiering WP Mar2023.pdf
WN Memory Tiering WP Mar2023.pdfWN Memory Tiering WP Mar2023.pdf
WN Memory Tiering WP Mar2023.pdfRochanSankar1
 
Computer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and MicrocontrollerComputer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and MicrocontrollerAmrutaMehata
 
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)Memory Fabric Forum
 
Computer Architecture Chapter 2 BUS
Computer Architecture Chapter 2 BUSComputer Architecture Chapter 2 BUS
Computer Architecture Chapter 2 BUSAlyssaAina1
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialmadhuinturi
 
Real Time Operating System
Real Time Operating SystemReal Time Operating System
Real Time Operating SystemSharad Pandey
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Flynn's Taxonomy
Flynn's TaxonomyFlynn's Taxonomy
Flynn's TaxonomyAshish KC
 

Similar to CXL chapter1 and chapter 2 presentation.pptx (20)

Cache performance-x86-2009
Cache performance-x86-2009Cache performance-x86-2009
Cache performance-x86-2009
 
Computer System Architecture
Computer System ArchitectureComputer System Architecture
Computer System Architecture
 
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSINGADVANCED COMPUTER ARCHITECTUREAND PARALLEL PROCESSING
ADVANCED COMPUTER ARCHITECTURE AND PARALLEL PROCESSING
 
Parallel Computing - Lec 3
Parallel Computing - Lec 3Parallel Computing - Lec 3
Parallel Computing - Lec 3
 
linux kernel overview 2013
linux kernel overview 2013linux kernel overview 2013
linux kernel overview 2013
 
IEEExeonmem
IEEExeonmemIEEExeonmem
IEEExeonmem
 
WN Memory Tiering WP Mar2023.pdf
WN Memory Tiering WP Mar2023.pdfWN Memory Tiering WP Mar2023.pdf
WN Memory Tiering WP Mar2023.pdf
 
Computer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and MicrocontrollerComputer Organization: Introduction to Microprocessor and Microcontroller
Computer Organization: Introduction to Microprocessor and Microcontroller
 
NUMA
NUMANUMA
NUMA
 
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)
Q1 Memory Fabric Forum: Intel Enabling Compute Express Link (CXL)
 
22CS201 COA
22CS201 COA22CS201 COA
22CS201 COA
 
Week5
Week5Week5
Week5
 
Computer Architecture Chapter 2 BUS
Computer Architecture Chapter 2 BUSComputer Architecture Chapter 2 BUS
Computer Architecture Chapter 2 BUS
 
Maxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorialMaxwell siuc hpc_description_tutorial
Maxwell siuc hpc_description_tutorial
 
Real Time Operating System
Real Time Operating SystemReal Time Operating System
Real Time Operating System
 
COA notes
COA notesCOA notes
COA notes
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Unit ii.arc of tms320 c5 xx
Unit ii.arc of tms320 c5 xxUnit ii.arc of tms320 c5 xx
Unit ii.arc of tms320 c5 xx
 
Cray xt3
Cray xt3Cray xt3
Cray xt3
 
Flynn's Taxonomy
Flynn's TaxonomyFlynn's Taxonomy
Flynn's Taxonomy
 

Recently uploaded

Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Nistarini College, Purulia (W.B) India
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxAleenaTreesaSaji
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzohaibmir069
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Patrick Diehl
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfnehabiju2046
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxAleenaTreesaSaji
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfSELF-EXPLANATORY
 

Recently uploaded (20)

Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...Bentham & Hooker's Classification. along with the merits and demerits of the ...
Bentham & Hooker's Classification. along with the merits and demerits of the ...
 
GFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptxGFP in rDNA Technology (Biotechnology).pptx
GFP in rDNA Technology (Biotechnology).pptx
 
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Munirka Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
zoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistanzoogeography of pakistan.pptx fauna of Pakistan
zoogeography of pakistan.pptx fauna of Pakistan
 
Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?Is RISC-V ready for HPC workload? Maybe?
Is RISC-V ready for HPC workload? Maybe?
 
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Mayapuri Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
9953056974 Young Call Girls In Mahavir enclave Indian Quality Escort service
 
A relative description on Sonoporation.pdf
A relative description on Sonoporation.pdfA relative description on Sonoporation.pdf
A relative description on Sonoporation.pdf
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Munirka Delhi 💯Call Us 🔝8264348440🔝
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Luciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptxLuciferase in rDNA technology (biotechnology).pptx
Luciferase in rDNA technology (biotechnology).pptx
 
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdfBehavioral Disorder: Schizophrenia & it's Case Study.pdf
Behavioral Disorder: Schizophrenia & it's Case Study.pdf
 

CXL chapter1 and chapter 2 presentation.pptx

  • 2. Agenda 1.What is CXL? 2.Why we need? 3.CXL vs PCI-E. 4.Where we place CXL. 5.CXL protocols . 6.CXL –Devices types .
  • 3. What is CXL? • CXL is an open standard industry-supported cache-coherent interconnect for processors, memory expansion, and accelerators. • Essentially, CXL technology maintains memory coherency between the CPU memory space and memory on attached devices.
  • 4. Why we need CXL ? 1.It aims to provide a high-speed, low-latency connection between CPUs, GPUs, FPGAs, and other accelerators while enabling coherent memory access between these devices. 2.CXL can work alongside PCIe, extending its capabilities and addressing the needs of emerging workloads, like artificial intelligence, machine learning, and high- performance computing.
  • 5. CXL vs PCI-e Purpose and Design: PCI-e (5.0) : PCIe is a widely used interconnect standard primarily designed for connecting various components within a computer system, such as graphics cards, storage devices, networking cards. CXL :designed with a focus on memory coherency and acceleration. It aims to provide a high-speed, low-latency connection between CPUs, GPUs, FPGAs, and other accelerators while enabling coherent memory access between these devices.
  • 6. Continuation - Memory Coherency: PCIe: While PCIe supports peer-to-peer data transfers, it does not inherently provide memory coherency between devices. In other words, when data is shared between devices over PCIe, managing data consistency in different caches and memory spaces can be complex and may require additional software overhead. CXL: CXL, however, is designed to support memory coherency, which means that devices connected via CXL can directly access each other's memory and maintain data consistency with lower software intervention..
  • 7. Continuation- Workload Acceleration: • PCIe: While PCIe is excellent for connecting a wide variety of devices, its lack of memory coherency support can lead to more significant overhead in certain scenarios, where data needs to be frequently synchronized between devices. • CXL: CXL's focus on memory coherency and acceleration makes it well-suited for tasks that require intensive data sharing and parallel processing, such as Artificial intelligence and scientific computing.
  • 8. Where we place CXL . • CXL builds upon the physical and electrical interfaces of pci-e with protocols that establish coherency, simplify the software stack, and maintain compatibility. with existing standards • CXL controls a PCIe 5 feature that allows alternate protocols to use the physical PCIe layer. • CXL transaction protocols are activated only if both sides support CXL. Otherwise, they operate as PCIe devices.
  • 9. Terminology / Acronyms • Accelerator:- Devices that may be used by software running on Host processors to offload or perform any type of compute or I/O tasks Examples of accelerators include programmable agents (such as GPU/GPGPU), fixed- function agents, or reconfigurable agents such as FPGAs. • Cache coherence : In a multiprocessor system, data inconsistency may occur among adjacent levels or within the same level of the memory hierarchy.
  • 10. Conceptual Diagram of Accelerator Attached to Processor via CXL
  • 11. Cache Coherence Protocols • MSI protocol (Modified, Shared, Invalid) • MOSI protocol (Modified, Owned, Shared, Invalid) • MESI protocol (Modified, Exclusive, Shared, Invalid) • MOESI protocol (Modified, Owned, Exclusive, Shared, Invalid)
  • 12. MESI protocol • The MESI protocol is an invalidate-based cache coherence protocol, and is one of the most common protocols that support write-back caches. It is also known as the Illinois protocol. • The letters in the acronym MESI represent four mutually exclusive states that a cache line can be marked with (encoded using two additional bits): • STATES: • Modified (M): The data in the cache differs from the data in main memory. The cache line has been modified, and the change must eventually be written back to main memory. • Exclusive (E): The data is clean, i.e., the cache and main memory hold identical data, and no other cache holds a copy of the line.
  • 13. Continuation: • Shared (S): The cache line is clean and up to date, and copies of it may also be held in other caches as well as in main memory. • Invalid (I): The cache line holds no valid data, so the block must be fetched from another cache or from main memory before it can be used. • Operation:
  • 14. Continuation: • The MESI protocol is defined by a finite-state machine that transitions from one state to another based on two stimuli. • The first stimulus is the processor-specific read or write request. For example: a processor P1 has a block X in its cache, and there is a request from the processor to read or write that block. • The second stimulus arrives over the bus connecting the processors. In particular, the "bus side requests" come from other processors that do not have the cache block or the updated data in their cache. • Different types of processor requests and bus side requests: • Processor requests to the cache include the following operations: 1. PrRd: The processor requests to read a cache block. 2. PrWr: The processor requests to write a cache block.
  • 15. Continuation • Bus side requests are the following: • BusRd: Snooped request that indicates there is a read request to a Cache block requested by another processor • BusRdX: Snooped request that indicates there is a write request to a Cache block requested by another processor that doesn't already have the block. • BusUpgr: Snooped request that indicates that there is a write request to a Cache block requested by another processor that already has that cache block residing in its own cache. • Flush: Snooped request that indicates that an entire cache block is written back to the main memory by another processor. • FlushOpt: Snooped request that indicates that an entire cache block is posted on the bus in order to supply it to another processor (Cache to Cache transfers). • Snooping Operation: In a snooping system, all caches on a bus monitor all the transactions on that bus. Every cache has a copy of the sharing status of every block of physical memory it has stored. The state of the block is changed according to the State Diagram of the protocol used. (Refer image above for MESI state diagram). The bus has snoopers on both sides: 1.Snooper towards the Processor/Cache side. 2.The snooping function on the memory side is done by the Memory controller.
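The processor and bus side requests listed above can be sketched as a small state machine for a single cache line. This is a minimal illustration under stated assumptions, not a reference implementation: the function names `on_processor_request` and `on_bus_request` and the `others_have_copy` flag are hypothetical, and where a shared line may optionally supply data with FlushOpt the sketch simply picks one behaviour.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


def on_processor_request(state, request, others_have_copy=False):
    """Return (next_state, bus_signal) for a PrRd or PrWr from the local processor.

    others_have_copy is an assumed input telling the cache whether another
    cache answered the BusRd; it only matters when the line starts Invalid.
    """
    if request == "PrRd":
        if state is State.I:
            # Read miss: fetch the block, Shared if someone else holds it.
            return (State.S if others_have_copy else State.E), "BusRd"
        return state, None  # read hit in M, E or S: no bus traffic
    if request == "PrWr":
        if state is State.I:
            return State.M, "BusRdX"   # write miss: fetch with intent to modify
        if state is State.S:
            return State.M, "BusUpgr"  # already have data, invalidate other copies
        return State.M, None           # E or M: silent upgrade or plain hit
    raise ValueError(f"unknown processor request: {request}")


def on_bus_request(state, request):
    """Return (next_state, response) for a request snooped from another processor."""
    if state is State.I:
        return State.I, None  # nothing cached, nothing to do
    if request == "BusRd":
        # Another cache wants to read: a dirty line is written back (Flush),
        # a clean line may be supplied cache-to-cache (FlushOpt, design choice).
        return State.S, ("Flush" if state is State.M else "FlushOpt")
    if request in ("BusRdX", "BusUpgr"):
        # Another cache wants to write: our copy becomes invalid.
        if state is State.M:
            return State.I, "Flush"
        if state is State.E:
            return State.I, "FlushOpt"
        return State.I, None
    raise ValueError(f"unknown bus request: {request}")
```

For example, a line that starts Invalid, is read with no other sharers, and is then written moves I → E → M, and only the first step puts a signal on the bus.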
  • 16.
  • 17.
  • 18. CXL protocols • CXL.io: This protocol is functionally equivalent to the PCIe protocol. As the foundational communication protocol, CXL.io is used for device discovery, enumeration, and link-up. • CXL.cache: This protocol, which is designed for more specific applications, enables accelerators to efficiently access and cache host memory for optimized performance. • CXL.mem: This protocol enables a host, such as a processor, to access device-attached memory using load/store commands. Together, these three protocols facilitate the coherent sharing of memory resources between computing devices.
  • 19. CXL device types • Type 1 Devices: CXL.io + CXL.cache • Type 2 Devices: CXL.io + CXL.cache + CXL.mem • Type 3 Devices: CXL.io + CXL.mem
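The mapping from protocol combination to device type can be written as a lookup table. This is only a sketch of the classification on this slide; `DEVICE_TYPES` and `classify_device` are hypothetical names, not part of any CXL software interface.

```python
# Protocol sets taken from the slide; the names below are illustrative only.
DEVICE_TYPES = {
    frozenset({"CXL.io", "CXL.cache"}): "Type 1",
    frozenset({"CXL.io", "CXL.cache", "CXL.mem"}): "Type 2",
    frozenset({"CXL.io", "CXL.mem"}): "Type 3",
}


def classify_device(protocols):
    """Return the device type for a collection of protocol names, or None."""
    return DEVICE_TYPES.get(frozenset(protocols))


print(classify_device(["CXL.io", "CXL.mem"]))
```

A combination without CXL.io returns `None` here, since every device type on the slide includes CXL.io.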
  • 20. Type 1 - Device with Cache • Type 1 Devices: Accelerators such as smart NICs typically lack local memory. Via CXL, these devices can communicate with the host processor’s DDR memory. • Type 1 CXL devices have special needs for which having a fully coherent cache in the device becomes valuable. • The size of cache that can be supported for such devices depends on the host’s snoop filtering capacity.
  • 21. Type 2 Device • Type 2 Devices: GPUs, ASICs, and FPGAs are all equipped with DDR or HBM memory and can use CXL to make the host processor’s memory locally available to the accelerator, and the accelerator’s memory locally available to the CPU. • The key goal for CXL is to provide a means for the Host to push operands into device-attached memory and for the Host to pull results out of device-attached memory. • The bias-based coherency model defines two states of bias for device-attached memory: 1. Host bias 2. Device bias.
  • 22. Continuation: • Host bias: When the device-attached memory is in Host Bias state, it appears to the device just as regular Host-attached memory does. That is, if the device needs to access it, it must send a request to the Host, which resolves coherency for the requested line. • Device bias: When the device-attached memory is in Device Bias state, the device is guaranteed that the Host does not have the line in any cache. As such, the device can access it without sending any transaction (requests, snoops, etc.) to the Host whatsoever. Note: The Host itself sees a uniform view of device-attached memory regardless of the bias state. In both modes, coherency is preserved for device-attached memory.
  • 23. Type 3 Device • A Type 3 CXL Device supports the CXL.io and CXL.mem protocols. An example of a Type 3 CXL device is a memory expander for the Host, as shown in the figure below. • Type 3 Devices: Memory devices can be attached via CXL to provide additional bandwidth and capacity to host processors. The type of memory is independent of the host’s main memory. • The device operates primarily over CXL.mem to service requests sent from the Host. The CXL.io protocol is primarily used for device discovery, enumeration, error reporting and management. • The CXL.io protocol is permitted to be used by the device for other I/O-specific application usages.
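The difference between the two bias states, as seen from the device, can be illustrated with a toy model. This is a sketch under assumptions: `DeviceAttachedMemory`, `device_access`, and the `host_requests` counter are hypothetical names for illustration and do not model the actual CXL transactions.

```python
from enum import Enum


class Bias(Enum):
    HOST = "host"
    DEVICE = "device"


class DeviceAttachedMemory:
    """Toy model of one region of device-attached memory (illustrative only)."""

    def __init__(self, bias=Bias.HOST):
        self.bias = bias
        self.host_requests = 0  # requests the device had to send to the Host

    def device_access(self):
        """Describe the path a device access takes under the current bias."""
        if self.bias is Bias.HOST:
            # Host bias: the Host resolves coherency for the requested line.
            self.host_requests += 1
            return "via host"
        # Device bias: the Host is guaranteed not to cache the line.
        return "direct"


mem = DeviceAttachedMemory()
print(mem.device_access())   # Host bias path
mem.bias = Bias.DEVICE
print(mem.device_access())   # Device bias path
```

Only the Host bias access increments `host_requests`, which mirrors the point that a device in Device bias sends no transaction to the Host.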
  • 24.
  • 25. Flex Bus • A Flex Bus port allows designs to choose between providing native PCIe protocol or CXL over a high-bandwidth, off-package link; the selection happens during link training via alternate protocol negotiation. Flex Bus Link Features: • Native PCIe mode, full feature support as defined in the PCIe specification. • Signaling rate of 32 GT/s, degraded rate of 16 GT/s or 8 GT/s in CXL mode. • Link width support for x16, x8, x4, x2 (degraded mode), and x1 (degraded mode) in CXL mode. • Bifurcation (aka Link Subdivision) support to x4 in CXL mode.
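To get a feel for what the signaling rates and link widths above mean, the raw per-direction bandwidth can be estimated as rate × lanes ÷ 8. This is a rough sketch: the helper name `raw_bandwidth_gbytes` is hypothetical, and the figure is raw line rate before encoding and protocol overhead, so usable bandwidth is lower.

```python
def raw_bandwidth_gbytes(rate_gt_s, lanes):
    """Raw one-direction bandwidth in GB/s: one bit per lane per transfer."""
    return rate_gt_s * lanes / 8


# Rates and widths listed on the slide for CXL mode.
for rate in (32, 16, 8):
    for lanes in (16, 8, 4, 2, 1):
        print(f"{rate} GT/s x{lanes}: {raw_bandwidth_gbytes(rate, lanes):g} GB/s raw")
```

A full-width x16 link at 32 GT/s works out to 64 GB/s raw in each direction, while a degraded x1 link at 8 GT/s carries 1 GB/s.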
  • 26. Flex Bus Layer overview