2. What is a NUMA System?
Non-uniform memory access is a computer memory design used in
multiprocessing, where the memory access time depends on the
memory location relative to the processor. Under NUMA, a processor
can access its own local memory faster than non-local memory.
(Figure: NUMA topology; P indicates a processor.)
3. Main Characteristics
• Consists of several nodes.
• Each node contains a subset of the system's CPUs and a portion of its RAM.
• Programs can transparently access memory on local and remote nodes without changes to the code.
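On Linux, the kernel exposes this node structure through sysfs. As a minimal sketch (assuming a Linux system; on other platforms the directory simply does not exist and an empty list is returned), the nodes of a NUMA machine can be enumerated like this:

```python
import glob
import os
import re

def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return the NUMA node IDs the Linux kernel exposes (empty if none)."""
    nodes = []
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        m = re.search(r"node(\d+)$", path)
        if m:
            nodes.append(int(m.group(1)))
    return sorted(nodes)

# A single-socket desktop typically reports one node; a multi-socket
# server reports one node per socket (or more).
print(list_numa_nodes())
```

Each `nodeN` directory in turn lists the CPUs and memory belonging to that node, matching the characteristics above.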
4. In Brief
Modern CPUs can generate an immense load on the memory subsystem. This causes congestion on memory controllers and interconnect links. When multiple cores access a single node, memory latencies can increase to up to 1200 cycles.
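Latencies like these are usually measured with a pointer-chasing microbenchmark: each load depends on the previous one, so the hardware prefetcher cannot hide the memory latency. A minimal sketch of the technique follows; note that in pure Python the interpreter overhead dominates, so this illustrates the method rather than reproducing cycle-accurate numbers like the 1200-cycle figure above.

```python
import random
import time

def chase_latency_ns(n=1_000_000):
    """Time one step of a dependent pointer chase, in nanoseconds."""
    # Build a random cyclic permutation: following nxt[i] visits every
    # index once, and each load depends on the previous result, which
    # defeats hardware prefetching.
    order = list(range(n))
    random.shuffle(order)
    nxt = [0] * n
    for i in range(n - 1):
        nxt[order[i]] = order[i + 1]
    nxt[order[-1]] = order[0]

    i = 0
    t0 = time.perf_counter()
    for _ in range(n):
        i = nxt[i]  # dependent access: cannot start until the last one finishes
    dt = time.perf_counter() - t0
    return dt / n * 1e9

print(f"{chase_latency_ns():.0f} ns per dependent access")
</parameter>```

Running the same chase against memory on a local node versus a remote node (e.g. via `numactl --membind`) is one way to expose the latency gap the slides describe.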
5. Local vs. Remote Differences for Single-Threaded Applications
Performance never degraded by more than 20 percent, even when all memory requests were remote.
6. Local vs. Remote Differences for Multithreaded Applications
The figure compares the two policies by showing the performance difference between the best and worst policy for each benchmark.
(F) indicates first touch
(I) indicates interleave
(-) indicates a negligible difference
7. Observations
The first observation to make is that no one policy is best for all applications. Several applications perform best with the first-touch policy, but many prefer interleaving. The second observation is that NUMA effects beyond the remote-access penalty can indeed severely affect performance.
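Why neither policy always wins can be seen in a toy model (my own illustration, not the study's experiment): first touch places each page on the node of the thread that touches it first, while interleave spreads pages round-robin across nodes. With shared data that one thread initializes, first touch maximizes that thread's locality but concentrates all traffic on one node; interleave lowers the local-access ratio but balances the load.

```python
def first_touch(accesses, thread_node):
    """Place each page on the node of the first thread that touches it."""
    placement = {}
    for tid, page in accesses:
        placement.setdefault(page, thread_node[tid])
    return placement

def interleave(pages, nodes):
    """Place pages round-robin across nodes."""
    return {p: nodes[i % len(nodes)] for i, p in enumerate(sorted(pages))}

def local_ratio(accesses, placement, thread_node):
    """Fraction of accesses that hit the accessing thread's own node."""
    local = sum(1 for tid, page in accesses if placement[page] == thread_node[tid])
    return local / len(accesses)

# Two threads on two nodes; thread 0 first-touches every page during
# initialization, then both threads share all pages.
thread_node = {0: 0, 1: 1}
accesses = [(0, p) for p in range(8)] + [(t, p) for p in range(8) for t in (0, 1)]
pages = {p for _, p in accesses}

ft = first_touch(accesses, thread_node)
il = interleave(pages, [0, 1])
print(local_ratio(accesses, ft, thread_node))  # higher: all pages local to thread 0
print(local_ratio(accesses, il, thread_node))  # lower, but traffic is balanced
```

First touch wins on the local-access ratio here, yet every access lands on node 0's memory controller; interleave trades locality for balance, which is exactly the tension the observations describe.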
8. Further Investigated Characteristics
• Local access ratio
• Memory latency
• Memory-controller imbalance
• Average interconnect usage
• Average interconnect imbalance
• IPC (instructions per cycle)
9. How to Make It Better
Avoiding performance pitfalls on NUMA systems requires considering how the nodes are connected, where the program's memory is placed, and how it accesses that memory.
A NUMA memory-management algorithm should place importance on congestion management, rather than focusing solely on reducing remote accesses.
The effects of imbalance and the local access ratio are reflected in the memory-access latency.
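One practical lever for controlling where memory is placed is CPU affinity: under the Linux kernel's default first-touch policy, pages are allocated on the node of the CPU that first touches them, so pinning a process to one node's CPUs keeps its memory local. A minimal sketch follows (Linux-only; it also assumes CPU 0 belongs to node 0, which is common but not guaranteed):

```python
import os

def bind_to_cpus(cpus):
    """Restrict the current process to the given CPUs, so that pages it
    first-touches are allocated from those CPUs' local node.

    Returns the resulting affinity set, or None where the call is
    unavailable (non-Linux platforms).
    """
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, set(cpus))  # 0 = the calling process
        return sorted(os.sched_getaffinity(0))
    return None

# Pin to CPU 0 (assumed here to sit on node 0) before initializing data.
print(bind_to_cpus({0}))
```

Tools like `numactl` offer the same control externally (e.g. binding both CPUs and memory to a node, or interleaving memory across all nodes), which is useful when the congestion-versus-locality trade-off above favors spreading the traffic instead.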
10. Conclusion
NUMA architecture enables scaling the processor count of today's server-class systems. In the near future, expect systems to have even more NUMA nodes and more complicated NUMA topologies. The two NUMA concerns of congestion and locality are hard to reconcile, and for any particular application we can't know the best memory placement beforehand.