Can You Get Performance from
Xeon Phi Easily?
Lessons Learned from Two Real
Cases
Objective
• Assess how much work is needed to use Intel
Xeon Phi.
• Minimal modifications, using only pragmas.
• Two applications:
– CalcuNetW: tests the MKL libraries.
– GammaMaps: tests pragmas.
• Two modes:
– Native: compiled to execute only on Xeon Phi
– Offload: uses Host + Xeon Phi
CalcuNetW: Calculate Measurements in Complex Networks
• Complex networks consist of sets of
nodes or vertices joined together in pairs by
links or edges.
• The application calculates, for each network:
– Subgraph Centrality (SC): characterizes the
participation of each node in all subgraphs in a
network.
– SC odd: accounts only for closed walks of odd length.
– SC even: accounts only for closed walks of even length.
– Bipartivity: the proportion of even closed walks to the total number of
closed walks in the network.
– Network Communicability for Connected Nodes,
C(p,q): measures how well two
nodes in the network communicate.
– Network Communicability, C(G): the mean of all
the C(p,q).
Mouriño J.C., Estrada E., Gomez A., “CalcuNetw: Calculate Measurements in Complex Networks”, Technical Report
CESGA-2005-003
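The measures listed above can be sketched in a few lines. The slides give no formulas, so this sketch assumes the usual definitions based on the matrix exponential of the adjacency matrix A: SC(i) is the diagonal entry (e^A)_ii, the even and odd parts come from the even and odd terms of the series sum_k A^k / k!, and C(p,q) is the off-diagonal entry (e^A)_pq. The function names, the truncation at 30 terms, and the averaging of C(p,q) over all distinct node pairs are illustrative assumptions, not the CalcuNetW implementation (which relies on MKL).

```python
def mat_mul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def walk_series(adj, terms=30):
    """Return (even, odd): partial sums of A^k / k! split by the parity of k."""
    n = len(adj)
    # term holds A^k / k!, starting from the identity (k = 0).
    term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    even = [[0.0] * n for _ in range(n)]
    odd = [[0.0] * n for _ in range(n)]
    for k in range(terms):
        target = even if k % 2 == 0 else odd
        for i in range(n):
            for j in range(n):
                target[i][j] += term[i][j]
        # Advance to A^(k+1) / (k+1)!
        term = [[x / (k + 1) for x in row] for row in mat_mul(term, adj)]
    return even, odd


def network_measures(adj):
    """Compute the measures named on the slide for one adjacency matrix."""
    n = len(adj)
    even, odd = walk_series(adj)
    sc_even = [even[i][i] for i in range(n)]
    sc_odd = [odd[i][i] for i in range(n)]
    sc = [e + o for e, o in zip(sc_even, sc_odd)]
    # Assumption: C(G) averages C(p,q) over all distinct pairs p < q.
    comm = [even[p][q] + odd[p][q] for p in range(n) for q in range(p + 1, n)]
    return {
        "sc": sc,
        "sc_even": sc_even,
        "sc_odd": sc_odd,
        "bipartivity": sum(sc_even) / sum(sc),
        "communicability": sum(comm) / len(comm),
    }


edge = [[0, 1], [1, 0]]                       # two nodes, one link (bipartite)
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # odd cycle (not bipartite)
edge_result = network_measures(edge)
triangle_result = network_measures(triangle)
```

In this sketch a bipartite network has no closed walks of odd length, so its bipartivity is 1, while the triangle falls below 1.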
CalcuNetW
GammaMaps: A figure-of-merit in Radiation
Therapy
[Figure: dose grids on X, Y, Z axes; dose in voxel i,j,k]
GammaMaps: A figure-of-merit in
Radiation Therapy
Workflow: Read Doses → Initialise and normalise → Compute Gamma → Store Gamma
• Application in FORTRAN 90
• Parallelised using OpenMP
• Geometric algorithm*
• 512 x 512 x 128 = 33,554,432 voxels
• Auto-vectorization
• Pragmas for offload
* T. Ju, T. Simpson, J. O. Deasy, and D. A. Low, “Geometric interpretation of the γ dose distribution
comparison technique: Interpolation-free calculation,” Medical Physics, vol. 35, no. 3, p. 879, 2008.
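To make the "Compute Gamma" step concrete, here is a minimal sketch of the gamma index on a 1D dose profile. It assumes the standard definition, in which the gamma value at a reference point is the minimum over evaluated points of sqrt((distance / distance tolerance)^2 + (dose difference / dose tolerance)^2). It uses a brute-force search over grid points, not the interpolation-free geometric algorithm of Ju et al. that GammaMaps uses, and every name and tolerance value is hypothetical.

```python
import math

# Grid size quoted on the slide.
GRID = (512, 512, 128)
N_VOXELS = GRID[0] * GRID[1] * GRID[2]


def gamma_1d(ref, ev, spacing, dose_tol, dist_tol):
    """Brute-force gamma index for two 1D dose profiles on the same grid.

    ref, ev  : reference and evaluated doses (same length)
    spacing  : distance between neighbouring grid points
    dose_tol : dose-difference criterion, in the same units as the doses
    dist_tol : distance-to-agreement criterion, in the same units as spacing
    """
    gammas = []
    for i, d_ref in enumerate(ref):
        best = math.inf
        for j, d_ev in enumerate(ev):
            dist = (j - i) * spacing
            value = math.sqrt((dist / dist_tol) ** 2
                              + ((d_ev - d_ref) / dose_tol) ** 2)
            best = min(best, value)
        gammas.append(best)
    return gammas


# Identical profiles agree everywhere, so gamma is 0 at every point.
same = gamma_1d([1.0, 2.0, 3.0], [1.0, 2.0, 3.0],
                spacing=1.0, dose_tol=0.03, dist_tol=3.0)
# A flat profile offset by exactly the dose tolerance gives gamma close to 1.
offset = gamma_1d([1.0, 1.0, 1.0], [1.03, 1.03, 1.03],
                  spacing=1.0, dose_tol=0.03, dist_tol=3.0)
```

The outer loop over reference points is independent per point, which is why this kind of computation lends itself to OpenMP parallelisation and offload; the brute-force inner search is quadratic and is used here only for clarity.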
Results of Experiments
Platform
Host
  CPU Model          Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
  Nr. of cores       16
  Memory             32788 MB
  Operating System   Linux 2.6.32-279.el6.x86_64
  Compiler Version   2013U2
Intel Xeon Phi
  Model              Beta0 Engineering Sample
  Nr. of cores       61 at 1.09GHz
  Memory             7936 MB
  Operating System   MPSS Gold U1
  Compiler Version   2013U2
  GDDR Technology    GDDR5
  GDDR Frequency     2750000 kHz
• Remote access to Intel systems
• Feb. 2013
Intel Xeon Phi Affinity Policies
[Figure: placement of 8 threads on 4 cores (C1 to C4), each with 4 hardware threads (HT1 to HT4); thread numbers shown per policy]
• COMPACT - FINE: 0 1 2 3 4 5 6 7
• SCATTER - FINE: 0 4 1 5 2 6 3 7
• BALANCED - FINE: 0 1 2 3 4 5 6 7
• BALANCED - CORE: {0,1} {2,3} {4,5} {6,7}
• Type
– Compact
– Scatter
– Balanced
• Granularity
– Fine or Thread
– Core
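The three policy types can be illustrated with a small placement model. This is a simplified sketch, assuming 4 cores with 4 hardware threads each as in the figure: compact fills one core before moving to the next, scatter deals threads round-robin across cores, and balanced spreads threads evenly while keeping consecutive threads on the same core. It models only the mapping, not the Intel OpenMP runtime itself (where such policies are typically selected through the KMP_AFFINITY environment variable), and all function names are hypothetical.

```python
def place_threads(n_threads, policy, n_cores=4, hts_per_core=4):
    """Return a list of (core, hardware_thread) slots, one per thread id."""
    if n_threads > n_cores * hts_per_core:
        raise ValueError("model assumes at most one thread per hardware thread")
    if policy == "compact":
        # Fill all hardware threads of a core before using the next core.
        return [(t // hts_per_core, t % hts_per_core) for t in range(n_threads)]
    if policy == "scatter":
        # Deal threads round-robin across cores.
        return [(t % n_cores, t // n_cores) for t in range(n_threads)]
    if policy == "balanced":
        # Spread threads evenly, keeping neighbouring thread ids on one core.
        base, extra = divmod(n_threads, n_cores)
        placement = []
        for core in range(n_cores):
            count = base + (1 if core < extra else 0)
            placement.extend((core, ht) for ht in range(count))
        return placement
    raise ValueError(f"unknown policy: {policy}")


def threads_by_core(placement, n_cores=4):
    """Group thread ids by the core they were placed on."""
    cores = [[] for _ in range(n_cores)]
    for thread_id, (core, _ht) in enumerate(placement):
        cores[core].append(thread_id)
    return cores


compact = threads_by_core(place_threads(8, "compact"))
scatter = threads_by_core(place_threads(8, "scatter"))
balanced = threads_by_core(place_threads(8, "balanced"))
```

With fine (thread) granularity each thread stays on its assigned hardware thread; with core granularity only the core component of the slot is binding, which corresponds to the per-core thread sets returned by `threads_by_core`.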
Results for CalcuNetW
CalcuNetW
Results for GammaMaps
GammaMaps
[Figure: Host. Elapsed time (s) versus number of threads; series: Host, local-compact-core, local-compact-fine, local-scatter-fine, local-scatter-core]
GammaMaps
Xeon Phi poor I/O
Conclusions
• Using the MKL library is easy and does not
require changes in the code.
• A few simple pragmas in the code are enough to get started quickly.
• I/O performance is an issue on Xeon Phi.
• 1 Xeon Phi ~ 1 Xeon E5-2680
• Improving performance further requires additional
work.
Acknowledgements
The authors would like to thank Intel for
providing access to the Intel Xeon Phi
coprocessor.
Questions
Andrés Gómez
José Carlos Mouriño
Carmen Cotelo
Aurelio Rodríguez
The TEAM
