SlideShare a Scribd company logo
Preliminary XSX fact finding
(it is not about 100% correctness,
it is about journey for connecting the dot), so dont use this as a fact!!!
plus RDNA1 Slide and Cache data size
by @blueisviolet
AMD Slide / Navi Fact
In Navi, two Compute Engines form a Workgroup Processor, and five of those form an
Asynchronous Compute Engine (ACE)
important link RDNA1 (Navi10)
https://www.amd.com/system/files/documents/rdna-
whitepaper.pdf
https://gpuopen.com/wp-
content/uploads/2019/08/RDNA_Architecture_public.pdf
https://gpuopen.com/rdna-shader-instruction-set-architecture-
document-now-available/
AMD Slide / Navi Fact
Some interesting of per SIMD32 block
- VGPR (128KB)
- SGPR (10KB)
also hmmm so ALU provide: x32 ALU + DP unit x2 + Transcedental? unit x8
32KB I$ for 4 SPU ( 2 WGP)
16KB K$ (data cache) per 2 WGP
AMD Slide / Navi Fact
In Navi, two Compute Engines (CU) form a Workgroup Processor, and five of those form
an Asynchronous Compute Engine (ACE)
AMD Slide / Navi Fact
1 WGP (2CU)=2xL0(2 x 16KB) + I$ (32KB) + K$ (16KB)+VGPR(512KB)+SGPR(40KB)+128KB LDS
5 WGP (1 Shader Array) (1 ACE) = connect to 128KB L1,
1 CU = 256KB VGPR + 20KB SGPR
So Why ? Seems the E F G has similar structure
Zoomed, in normal CU, the 4 5 6 7 usually different
it is like 10 Group, instead 4 like Navi, it is 6, what we know it seems
per WGP can be customized, also remember X1 Hotchip
on higher level diagram they showed as per 6 CU
Another thing is from Github, it is said 3 GDS and 6
LDS per group. Also provided X1 Hotchip slide (per 6)

More Related Content

What's hot

Tobias Oetiker: RRDtool - how to make it sit up and beg
Tobias Oetiker: RRDtool - how to make it sit up and begTobias Oetiker: RRDtool - how to make it sit up and beg
Tobias Oetiker: RRDtool - how to make it sit up and beg
Friprogsenteret
 
MapServer #ProTips 2015
MapServer #ProTips 2015MapServer #ProTips 2015
MapServer #ProTips 2015
Jeff McKenna
 
Federated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation TherapyFederated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation Therapy
CESGA Centro de Supercomputación de Galicia
 
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmap
cmatthieu
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmap
wearefractal
 
Map Analytics in Starcraft II
Map Analytics in Starcraft IIMap Analytics in Starcraft II
Map Analytics in Starcraft II
gy8
 
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
Oregon State University
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
Rogue Wave Software
 
GPU Programming with CUDA
GPU Programming with CUDAGPU Programming with CUDA
GPU Programming with CUDA
Filipo Mór
 
Architectural Analysis of Game Machines
Architectural Analysis of Game MachinesArchitectural Analysis of Game Machines
Architectural Analysis of Game Machines
Praveen AP
 
Working together with SURF Raymond Oonk Annette Langedijk SURF
Working together with SURF Raymond Oonk Annette Langedijk SURFWorking together with SURF Raymond Oonk Annette Langedijk SURF
Working together with SURF Raymond Oonk Annette Langedijk SURF
CommunicatieSURF
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0
Simon Elliston Ball
 
Wms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo ServerWms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo Server
DonnyV
 
WMS Performance Shootout 2011
WMS Performance Shootout 2011WMS Performance Shootout 2011
WMS Performance Shootout 2011
Jeff McKenna
 
cnsm2011_slide
cnsm2011_slidecnsm2011_slide
cnsm2011_slide
rerngvit yanggratoke
 
GPU
GPUGPU
Japan Lustre User Group 2014
Japan Lustre User Group 2014Japan Lustre User Group 2014
Japan Lustre User Group 2014
Hitoshi Sato
 
OpenPOWER Application Optimisation meet up
OpenPOWER Application Optimisation meet up OpenPOWER Application Optimisation meet up
OpenPOWER Application Optimisation meet up
Ganesan Narayanasamy
 
mod-geocache / mapcache - A fast tiling solution for the apache web server
mod-geocache / mapcache - A fast tiling solution for the apache web servermod-geocache / mapcache - A fast tiling solution for the apache web server
mod-geocache / mapcache - A fast tiling solution for the apache web server
tbonfort
 
Managing Large Datasets in LabVIEW
Managing Large Datasets in LabVIEWManaging Large Datasets in LabVIEW
Managing Large Datasets in LabVIEW
James McNally
 

What's hot (20)

Tobias Oetiker: RRDtool - how to make it sit up and beg
Tobias Oetiker: RRDtool - how to make it sit up and begTobias Oetiker: RRDtool - how to make it sit up and beg
Tobias Oetiker: RRDtool - how to make it sit up and beg
 
MapServer #ProTips 2015
MapServer #ProTips 2015MapServer #ProTips 2015
MapServer #ProTips 2015
 
Federated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation TherapyFederated HPC Clouds applied to Radiation Therapy
Federated HPC Clouds applied to Radiation Therapy
 
Nodester Architecture overview & roadmap
Nodester Architecture overview & roadmapNodester Architecture overview & roadmap
Nodester Architecture overview & roadmap
 
nodester Architecture overview & roadmap
nodester Architecture overview & roadmapnodester Architecture overview & roadmap
nodester Architecture overview & roadmap
 
Map Analytics in Starcraft II
Map Analytics in Starcraft IIMap Analytics in Starcraft II
Map Analytics in Starcraft II
 
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
Using GPUs to accelerate nonstiff and stiff chemical kinetics in combustion s...
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
GPU Programming with CUDA
GPU Programming with CUDAGPU Programming with CUDA
GPU Programming with CUDA
 
Architectural Analysis of Game Machines
Architectural Analysis of Game MachinesArchitectural Analysis of Game Machines
Architectural Analysis of Game Machines
 
Working together with SURF Raymond Oonk Annette Langedijk SURF
Working together with SURF Raymond Oonk Annette Langedijk SURFWorking together with SURF Raymond Oonk Annette Langedijk SURF
Working together with SURF Raymond Oonk Annette Langedijk SURF
 
Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0Riding the Elephant - Hadoop 2.0
Riding the Elephant - Hadoop 2.0
 
Wms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo ServerWms Performance Tests Map Server Vs Geo Server
Wms Performance Tests Map Server Vs Geo Server
 
WMS Performance Shootout 2011
WMS Performance Shootout 2011WMS Performance Shootout 2011
WMS Performance Shootout 2011
 
cnsm2011_slide
cnsm2011_slidecnsm2011_slide
cnsm2011_slide
 
GPU
GPUGPU
GPU
 
Japan Lustre User Group 2014
Japan Lustre User Group 2014Japan Lustre User Group 2014
Japan Lustre User Group 2014
 
OpenPOWER Application Optimisation meet up
OpenPOWER Application Optimisation meet up OpenPOWER Application Optimisation meet up
OpenPOWER Application Optimisation meet up
 
mod-geocache / mapcache - A fast tiling solution for the apache web server
mod-geocache / mapcache - A fast tiling solution for the apache web servermod-geocache / mapcache - A fast tiling solution for the apache web server
mod-geocache / mapcache - A fast tiling solution for the apache web server
 
Managing Large Datasets in LabVIEW
Managing Large Datasets in LabVIEWManaging Large Datasets in LabVIEW
Managing Large Datasets in LabVIEW
 

Similar to Preliminary xsx die_fact_finding

20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
Kohei KaiGai
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
Kohei KaiGai
 
Woden 2: Developing a modern 3D graphics engine in Smalltalk
Woden 2: Developing a modern 3D graphics engine in SmalltalkWoden 2: Developing a modern 3D graphics engine in Smalltalk
Woden 2: Developing a modern 3D graphics engine in Smalltalk
ESUG
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Databricks
 
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
Joshua Mora
 
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
オラクルエンジニア通信
 
20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference
Kohei KaiGai
 
Analytics DB Benchmark
Analytics DB BenchmarkAnalytics DB Benchmark
Analytics DB Benchmark
Denny Marton
 
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Stefano Di Carlo
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Advances in GPU Computing
Advances in GPU ComputingAdvances in GPU Computing
Advances in GPU Computing
Frédéric Parienté
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
Arka Ghosh
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
Kazuaki Ishizaki
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AI
DataWorks Summit
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Nelson Calero
 
NetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and VerticaNetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and Vertica
Josef Niedermeier
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
NVIDIA Japan
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
Jeff Larkin
 

Similar to Preliminary xsx die_fact_finding (20)

20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage20170602_OSSummit_an_intelligent_storage
20170602_OSSummit_an_intelligent_storage
 
PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)PG-Strom v2.0 Technical Brief (17-Apr-2018)
PG-Strom v2.0 Technical Brief (17-Apr-2018)
 
Woden 2: Developing a modern 3D graphics engine in Smalltalk
Woden 2: Developing a modern 3D graphics engine in SmalltalkWoden 2: Developing a modern 3D graphics engine in Smalltalk
Woden 2: Developing a modern 3D graphics engine in Smalltalk
 
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
Running Apache Spark on a High-Performance Cluster Using RDMA and NVMe Flash ...
 
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
Do Theoretical Flo Ps Matter For Real Application’S Performance Kaust 2012
 
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
Oracle Cloud Infrastructure - High Performance Computingのご紹介 [2020年5月版]
 
20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference20181210 - PGconf.ASIA Unconference
20181210 - PGconf.ASIA Unconference
 
Analytics DB Benchmark
Analytics DB BenchmarkAnalytics DB Benchmark
Analytics DB Benchmark
 
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
Multi-faceted Microarchitecture Level Reliability Characterization for NVIDIA...
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Advances in GPU Computing
Advances in GPU ComputingAdvances in GPU Computing
Advances in GPU Computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Exploiting GPUs in Spark
Exploiting GPUs in SparkExploiting GPUs in Spark
Exploiting GPUs in Spark
 
Hortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AIHortonworks on IBM POWER Analytics / AI
Hortonworks on IBM POWER Analytics / AI
 
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
Automate Oracle database patches and upgrades using Fleet Provisioning and Pa...
 
NetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and VerticaNetFlow Data processing using Hadoop and Vertica
NetFlow Data processing using Hadoop and Vertica
 
RAPIDS Overview
RAPIDS OverviewRAPIDS Overview
RAPIDS Overview
 
XT Best Practices
XT Best PracticesXT Best Practices
XT Best Practices
 

More from mistercteam

20150207 howes-gpgpu8-dark secrets
20150207 howes-gpgpu8-dark secrets20150207 howes-gpgpu8-dark secrets
20150207 howes-gpgpu8-dark secrets
mistercteam
 
S0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudaS0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cuda
mistercteam
 
201210 howes-hsa and-the_modern_gpu
201210 howes-hsa and-the_modern_gpu201210 howes-hsa and-the_modern_gpu
201210 howes-hsa and-the_modern_gpu
mistercteam
 
3 673 (1)
3 673 (1)3 673 (1)
3 673 (1)
mistercteam
 
3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)
mistercteam
 
5 baker oxide (1)
5 baker oxide (1)5 baker oxide (1)
5 baker oxide (1)
mistercteam
 
The technology behind_the_elemental_demo_16x9-1248544805
The technology behind_the_elemental_demo_16x9-1248544805The technology behind_the_elemental_demo_16x9-1248544805
The technology behind_the_elemental_demo_16x9-1248544805
mistercteam
 
Lecture14
Lecture14Lecture14
Lecture14
mistercteam
 
01 intro-bps-2011
01 intro-bps-201101 intro-bps-2011
01 intro-bps-2011
mistercteam
 
Gdce 2010 dx11
Gdce 2010 dx11Gdce 2010 dx11
Gdce 2010 dx11
mistercteam
 
Hpg2011 papers kazakov
Hpg2011 papers kazakovHpg2011 papers kazakov
Hpg2011 papers kazakov
mistercteam
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
mistercteam
 
Mantle programming-guide-and-api-reference
Mantle programming-guide-and-api-referenceMantle programming-guide-and-api-reference
Mantle programming-guide-and-api-reference
mistercteam
 
D3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performanceD3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performance
mistercteam
 
D3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performanceD3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performance
mistercteam
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-rendering
mistercteam
 
Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12
mistercteam
 

More from mistercteam (17)

20150207 howes-gpgpu8-dark secrets
20150207 howes-gpgpu8-dark secrets20150207 howes-gpgpu8-dark secrets
20150207 howes-gpgpu8-dark secrets
 
S0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudaS0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cuda
 
201210 howes-hsa and-the_modern_gpu
201210 howes-hsa and-the_modern_gpu201210 howes-hsa and-the_modern_gpu
201210 howes-hsa and-the_modern_gpu
 
3 673 (1)
3 673 (1)3 673 (1)
3 673 (1)
 
3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)3 boyd direct3_d12 (1)
3 boyd direct3_d12 (1)
 
5 baker oxide (1)
5 baker oxide (1)5 baker oxide (1)
5 baker oxide (1)
 
The technology behind_the_elemental_demo_16x9-1248544805
The technology behind_the_elemental_demo_16x9-1248544805The technology behind_the_elemental_demo_16x9-1248544805
The technology behind_the_elemental_demo_16x9-1248544805
 
Lecture14
Lecture14Lecture14
Lecture14
 
01 intro-bps-2011
01 intro-bps-201101 intro-bps-2011
01 intro-bps-2011
 
Gdce 2010 dx11
Gdce 2010 dx11Gdce 2010 dx11
Gdce 2010 dx11
 
Hpg2011 papers kazakov
Hpg2011 papers kazakovHpg2011 papers kazakov
Hpg2011 papers kazakov
 
Dx11 performancereloaded
Dx11 performancereloadedDx11 performancereloaded
Dx11 performancereloaded
 
Mantle programming-guide-and-api-reference
Mantle programming-guide-and-api-referenceMantle programming-guide-and-api-reference
Mantle programming-guide-and-api-reference
 
D3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performanceD3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performance
 
D3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performanceD3 d12 a-new-meaning-for-efficiency-and-performance
D3 d12 a-new-meaning-for-efficiency-and-performance
 
Advancements in-tiled-rendering
Advancements in-tiled-renderingAdvancements in-tiled-rendering
Advancements in-tiled-rendering
 
Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12Getting the-best-out-of-d3 d12
Getting the-best-out-of-d3 d12
 

Recently uploaded

How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
Zilliz
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
Tomaz Bratanic
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Speck&Tech
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
Neo4j
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
SOFTTECHHUB
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Malak Abu Hammad
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
DianaGray10
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
Zilliz
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
Pixlogix Infotech
 

Recently uploaded (20)

How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
Full-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalizationFull-RAG: A modern architecture for hyper-personalization
Full-RAG: A modern architecture for hyper-personalization
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
GraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracyGraphRAG for Life Science to increase LLM accuracy
GraphRAG for Life Science to increase LLM accuracy
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
Cosa hanno in comune un mattoncino Lego e la backdoor XZ?
 
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
GraphSummit Singapore | Enhancing Changi Airport Group's Passenger Experience...
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0!
 
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdfUnlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
Unlock the Future of Search with MongoDB Atlas_ Vector Search Unleashed.pdf
 
Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1Communications Mining Series - Zero to Hero - Session 1
Communications Mining Series - Zero to Hero - Session 1
 
Infrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI modelsInfrastructure Challenges in Scaling RAG with Custom AI models
Infrastructure Challenges in Scaling RAG with Custom AI models
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
Best 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERPBest 20 SEO Techniques To Improve Website Visibility In SERP
Best 20 SEO Techniques To Improve Website Visibility In SERP
 

Preliminary xsx die_fact_finding

  • 1. Preliminary XSX fact finding (it is not about 100% correctness, it is about journey for connecting the dot), so dont use this as a fact!!! plus RDNA1 Slide and Cache data size by @blueisviolet
  • 2. AMD Slide / Navi Fact In Navi, two Compute Engines form a Workgroup Processor, and five of those form an Asynchronous Compute Engine (ACE) important link RDNA1 (Navi10) https://www.amd.com/system/files/documents/rdna- whitepaper.pdf https://gpuopen.com/wp- content/uploads/2019/08/RDNA_Architecture_public.pdf https://gpuopen.com/rdna-shader-instruction-set-architecture- document-now-available/
  • 3. AMD Slide / Navi Fact Some interesting of per SIMD32 block - VGPR (128KB) - SGPR (10KB) also hmmm so ALU provide: x32 ALU + DP unit x2 + Transcedental? unit x8 32KB I$ for 4 SPU ( 2 WGP) 16KB K$ (data cache) per 2 WGP
  • 4. AMD Slide / Navi Fact In Navi, two Compute Engines (CU) form a Workgroup Processor, and five of those form an Asynchronous Compute Engine (ACE)
  • 5. AMD Slide / Navi Fact 1 WGP (2CU)=2xL0(2 x 16KB) + I$ (32KB) + K$ (16KB)+VGPR(512KB)+SGPR(40KB)+128KB LDS 5 WGP (1 Shader Array) (1 ACE) = connect to 128KB L1, 1 CU = 256KB VGPR + 20KB SGPR
  • 6. So Why ? Seems the E F G has similar structure
  • 7. Zoomed, in normal CU, the 4 5 6 7 usually different it is like 10 Group, instead 4 like Navi, it is 6, what we know it seems per WGP can be customized, also remember X1 Hotchip on higher level diagram they showed as per 6 CU
  • 8. Another thing is from Github, it is said 3 GDS and 6 LDS per group. Also provided X1 Hotchip slide (per 6)