Cache Performance Evaluation
for Multiprocessor System
Drs. Alfred Mutanga
Management Information Systems Specialist
University of Venda
Date: 17 November 2013
Venue: Novotel Hotel, World Trade Centre, Dubai, UAE
Why a Cache Performance Evaluation System?
The Memory Wall?
– Memory and Processor Speeds
• Processor speeds rising dramatically, at 75% p.a.
• Memory clock speeds rising at a paltry 7% p.a.

– Result: a divergence in operating speeds
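The divergence can be made concrete with a little compounding arithmetic. The sketch below takes the two growth rates quoted on the slide at face value and assumes, purely for illustration, that processor and memory speeds start out equal; the function name `speed_gap` is hypothetical.

```python
def speed_gap(years, cpu_growth=0.75, mem_growth=0.07):
    """Ratio of processor speed to memory speed after `years`.

    Assumes both speeds start at the same value (an illustrative
    assumption, not a figure from the slides) and compound annually
    at the rates quoted on the slide.
    """
    return ((1 + cpu_growth) / (1 + mem_growth)) ** years


if __name__ == "__main__":
    for years in (1, 5, 10):
        print(f"after {years:2d} years the gap is about {speed_gap(years):.1f}x")
```

Under these assumptions the gap widens by a factor of 1.75 / 1.07 every year, so it compounds rather than growing linearly.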
Research Questions
1. To what extent does the number of processors in multiprocessor architectures affect the performance of level-one (L1) data cache memory systems?
2. How do cache coherency protocols influence the L1 data cache memory performance of multiprocessor architectures?
Theoretical Framework
The Challenges of Multi-core architectures?
– Programmability
– Scalability
– Communications
– Management of heterogeneous architectures

– Cache Memory Systems
– Attempts to increase memory bandwidth by introducing concurrency in memory access
– Required regular memory access patterns – resulted in degradation in memory performance
Memory Hierarchy

Architectural issues of Memory Hierarchy?
– Brings conflicting requirements in the memory systems
• Computing systems require a large and fast memory to scale up performance
– MH attempts to make slow memory appear fast by buffering data into smaller, faster memories close to CPUs
– Electronic systems slow down as they increase in size (compromise between power and performance)
– Most common solution to the memory wall is to cache data
Research Methodology
[Diagram: four CPUs, each with its own cache, connected through an interconnection network to memory]

• Linux Environment – Arch Linux
• SystemC
• Memory Trace Files
  – Fast Fourier Trace Files
  – Random Trace Files
  – Debugging Trace Files
• Distributed Shared Memory System
• Cache Coherence Protocols
  – Snoopy (Valid-Invalid)
  – Directory based (MOESI)
• Cache Memory
  – 32KB Level-1 Data Cache
  – 32 Byte line
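The cache parameters above (32 KB L1 data cache, 32-byte lines) fix how an address is split and how a hit rate can be measured from a trace. The following is a minimal sketch, not the SystemC model used in the study: it assumes a direct-mapped cache (the slides do not state the associativity), and the names `split_address` and `hit_rate` are hypothetical.

```python
CACHE_SIZE = 32 * 1024                # 32 KB L1 data cache (from the slide)
LINE_SIZE = 32                        # 32-byte line (from the slide)
NUM_LINES = CACHE_SIZE // LINE_SIZE   # number of lines in the cache

OFFSET_BITS = LINE_SIZE.bit_length() - 1   # bits selecting a byte within a line
INDEX_BITS = NUM_LINES.bit_length() - 1    # bits selecting a line (direct-mapped assumption)


def split_address(addr):
    """Split a byte address into (tag, index, offset)."""
    offset = addr & (LINE_SIZE - 1)
    index = (addr >> OFFSET_BITS) & (NUM_LINES - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


def hit_rate(trace):
    """Fraction of accesses in `trace` (a list of byte addresses) that hit."""
    tags = [None] * NUM_LINES   # one stored tag per line; None means invalid
    hits = 0
    for addr in trace:
        tag, index, _ = split_address(addr)
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag   # miss: fill the line, evicting the old tag
    return hits / len(trace) if trace else 0.0


if __name__ == "__main__":
    sequential = list(range(0, 64, 4))   # 16 word accesses spanning two lines
    print(NUM_LINES, OFFSET_BITS, INDEX_BITS)
    print(hit_rate(sequential))
```

With these parameters the cache holds 1024 lines, so an address carries 5 offset bits and, under the direct-mapped assumption, 10 index bits. A sequential trace misses once per line and hits on the remaining accesses to that line, while two addresses 32 KB apart would evict each other on every access.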
Design and Implementation in SystemC
• Memory Module: simulated the shared bulk memory (RAM)
• CPU Module: connects to the other modules, such as the cache and memory, using the appropriate ports
• Cache Module: defined the cache properties and macros that were used throughout the simulation
• Simple Bus Module: connected to the different address ports in the cache using an appropriate bus signal
• Cache Helper Libraries: files that collected the traces of the memory requests during each program execution
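How these modules interact under a snoopy valid-invalid protocol can be sketched in a few lines. This is an illustrative Python model, not the SystemC implementation described above. It assumes write-through caches with write-allocate, ignores capacity evictions, and uses the bus transaction count only as a rough stand-in for bus contention; the class name `SnoopyBus` and the trace format `(cpu, op, addr)` are hypothetical.

```python
LINE_SIZE = 32  # bytes per cache line (from the slides)


class SnoopyBus:
    """Shared bus connecting per-CPU caches under a valid-invalid protocol."""

    def __init__(self, num_cpus):
        # Each cache is modelled as the set of line numbers it holds as Valid.
        # Capacity and evictions are ignored (simplifying assumption).
        self.caches = [set() for _ in range(num_cpus)]
        self.accesses = 0
        self.hits = 0
        self.bus_transactions = 0
        self.invalidations = 0

    def access(self, cpu, op, addr):
        """Apply one memory request: op is "R" (read) or "W" (write)."""
        if op not in ("R", "W"):
            raise ValueError(f"unknown operation: {op!r}")
        line = addr // LINE_SIZE
        cache = self.caches[cpu]
        self.accesses += 1
        hit = line in cache
        if hit:
            self.hits += 1
        if op == "R":
            if not hit:
                self.bus_transactions += 1   # read miss: fetch the line over the bus
                cache.add(line)
        else:
            self.bus_transactions += 1       # write-through: every write uses the bus
            for other, other_cache in enumerate(self.caches):
                if other != cpu and line in other_cache:
                    other_cache.discard(line)    # snooping caches drop their copy
                    self.invalidations += 1
            cache.add(line)                  # write-allocate (assumption)

    def hit_rate(self):
        return self.hits / self.accesses if self.accesses else 0.0


# A tiny hand-written trace: (cpu, operation, byte address).
trace = [(0, "R", 0x100), (1, "R", 0x100), (0, "W", 0x100),
         (1, "R", 0x100), (0, "R", 0x104)]
bus = SnoopyBus(num_cpus=2)
for cpu, op, addr in trace:
    bus.access(cpu, op, addr)
print(bus.hit_rate(), bus.bus_transactions, bus.invalidations)
```

In this trace the write by CPU 0 invalidates the copy held by CPU 1, so CPU 1's next read of the same line misses and goes back to the bus. That is the kind of coherence traffic that grows as processors are added.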
Average Hit Rate Using Random Traces
Average Hit Rate Using Fast-Fourier Transform Traces
Average Bus Contention Using Random Traces
Average Bus Contention Using Fast-Fourier Transform Traces
Conclusions
• Write-invalidate: needs management of dynamic requests
• Execution time: increases with the number of processors
• Snooping: has a direct effect on the cache
• Synchronization of caches and compiler optimizations: can increase cache performance
• Cache coherency protocols: directory-based protocols have a slight performance edge over snooping protocols
Acknowledgements
• Jesshope, C. (2008, 2009, 2011): Trace files
• Bhasker, J. (2009): SystemC Primer
• OSCI: SystemC Libraries
• AMD64 Programming Manual: MOESI Protocol
• Hennessy, J. L. and Patterson, D. A. (2007): Computer Architecture: A Quantitative Approach
• Etc.
Thank you
