HPCLib & Excel : An efficient way to compute with Xeon PHI

Xeon Phi experience feedback
―――
February 2012
Practice HPC
07/05/2014 © ANEO – All rights reserved

Damien DUBUC
HPC software Expert
―――
April 2014
Practice HPC

Summary
What is our role?
HPCLib history with Xeon Phi
General library concepts and architectures
C# and Excel use cases
Performance measures
Q&A

What is our role?
Advanced Computing Technologies: HPCLib™
OUR SECTORS
OUR PRACTICES
BUSINESS PERFORMANCE
The Business Performance practice comprises consulting offers in organization and operational excellence.
INFORMATION SYSTEMS
The Information Systems practice works on the governance, construction and operation of companies' information systems.
INDUSTRIAL SYSTEMS
The Industrial Systems practice specializes in the industrial sector and includes all of ANEO's consulting services.

HPC expert in all steps
Presentation
Proposal
Concepts
Convict
o deal with client projects
o analyze all algorithms and find their bottlenecks
o revisit the algorithms
o find the best-suited technologies
o (re)organize the project to use the new technology
o optimize the algorithm locally
o change and parallelize the whole client algorithm
o seek the maximum benefit from the chosen technology
o validate the numerical precision
o validate the optimization
o help the client build a self-sufficient team

Feedback from Xeon Phi projects
• 4 Xeon Phi experiences with our clients
o 1 financial model and calibration project
o 2 financial risk calculation projects
o 1 automation project
• 2 experiences at ANEO
o HPCLib linear algebra library
o Architecture benchmarks

HPCLib history with Xeon PHI
Our Intel Xeon Phi product
Entity              Characteristic
Processor           60 cores at 1.053 GHz / 240 threads
GFLOPS (max)        Up to 1 TFLOPS
RAM                 8 GB with 320 GB/s bandwidth
KNC slot            PCIe x16 Gen 2
Cache memory        32 KB L1 and 512 KB L2 (per core)
Operating system    Linux
Instructions        x86, 512-bit
TDP                 225 W
Host OS             Red Hat Enterprise Linux 6.x and SUSE Linux 12+
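
As a rough sanity check of the "up to 1 TFLOPS" figure, the double-precision peak can be estimated from the core count and clock in the table. The sketch assumes each core retires one 512-bit fused multiply-add per cycle (8 double-precision lanes, 2 floating-point operations per lane); that assumption is not stated on the slide.

```latex
\[
P_{\text{peak}}
  = 60\ \text{cores} \times 1.053\ \text{GHz} \times 8\ \text{lanes} \times 2\ \text{flops}
  = 1010.88\ \text{GFLOPS} \approx 1\ \text{TFLOPS}
\]
```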

HPCLib history with Xeon PHI
[Figure: starting from the same source code, four execution models are compared, each ending in results: CPU execution, heterogeneous execution (CPU and Phi compute), asynchronous execution, and KNC execution. Diagram labels: Sequential, Parallel, Native.]

Design
[Figure: software stack. Layers: HPCLib Excel add-in; HPCLib C# (high level); HPCLib (C++ wrapper); HPCLib (low level); driver; operating system. Toolchain labels: Windows C# framework; Intel / GCC / Windows compiler; Intel compiler.]

Design
[Figure: HPCLib architecture. User side: C# and Excel through HPCLib .NET, and C/C++ through a wrapper. HPCLib calculation interface: solver, container, mathematical function. Hardware side: OpenCL and CUDA for NVIDIA and AMD GPUs, C/C++ for the CPU, C/C++ for the Xeon Phi.]

Usage
How to use?
To use a container:
hpclib::hpcMatrix<float, MIC>(ROWS, COLS)
To use an algorithm:
hpclib::SVD<float, MIC>(Arg1, Arg2)
In both cases:
• The name selects the type of object (Vector, Matrix, …) or the algorithm
• Precision (Long, Int, Float, Double): float in these examples
• Architecture: MIC in these examples
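
To make the precision and architecture parameters concrete, here is a minimal sketch of a container templated on both. It is not HPCLib code: `ToyMatrix`, `ArchName` and the `CPU` tag are hypothetical names introduced for illustration (only `MIC` appears on the slide), and the sketch always stores its data in host memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical architecture tags. Only the name MIC appears on the slide.
struct CPU {};
struct MIC {};

// Maps an architecture tag to a printable name (illustration only).
template <typename Arch> struct ArchName;
template <> struct ArchName<CPU> { static std::string get() { return "CPU"; } };
template <> struct ArchName<MIC> { static std::string get() { return "MIC"; } };

// Toy dense matrix parameterised by precision T and architecture tag Arch.
// A real library would allocate on the device selected by Arch; this sketch
// keeps everything in host memory and only records the target.
template <typename T, typename Arch>
class ToyMatrix {
public:
    ToyMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, T{}) {}

    // Row-major element access with a debug-time bounds check.
    T& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    static std::string target() { return ArchName<Arch>::get(); }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};
```

With these definitions, `ToyMatrix<float, MIC> m(3, 2);` mirrors the shape of the call shown on the slide: changing the precision or the target only changes a template argument, not the calling code.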

HPCLib Example
Simple presentation of HPCLib with C++

General design
User / Client
Containers: vectors, sparse vectors, matrices, sparse matrices, tridiagonal matrices…
Mathematical functions: linear algebra, transposition, FFT, RNG, AES…
Resolution algorithms: CG-Stab, Bi-CG-Stab, SVD, RRLSQR…
Hardware: CPU, Xeon Phi, GPU

Mathematical functions
HPCLib contains classic mathematical functions as well as more complex programmed functions:
• Addition, subtraction, multiplication...
• 1-norm and 2-norm of vectors
• Matrix invertibility checks
• Matrix transposition
• Fourier transform (1D, 2D, 3D)
• Inverse Fourier transform (1D, 2D, 3D) on matrices
• Matrix multiplication, dense and sparse
HPCLib also includes random number generation and encryption algorithms:
• Random number generation (Sobol, stddev, …)
• File and data-stream encryption (AES 128-bit, 256-bit)
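
For reference, the two vector norms in the list above can be written in a few lines of plain C++. This is a generic CPU sketch with hypothetical function names (`norm1`, `norm2`), not the HPCLib implementation.

```cpp
#include <cmath>
#include <vector>

// 1-norm: sum of the absolute values of the entries.
double norm1(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += std::fabs(x);
    return sum;
}

// 2-norm (Euclidean norm): square root of the sum of squared entries.
double norm2(const std::vector<double>& v) {
    double sum = 0.0;
    for (double x : v) sum += x * x;
    return std::sqrt(sum);
}
```

For the vector (3, -4), the 1-norm is 7 and the 2-norm is 5.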

Resolution algorithms
HPCLib contains complex programmed resolution algorithms:
• Conjugate gradient
o For all matrices
• Biconjugate gradient stabilized (Bi-CG-Stab)
o For all matrices
• Singular Value Decomposition
o For dense and sparse matrices
• GMRES
o For all matrices
• RRLSQR
o For all matrices
• Block matrix resolution (ADI method)
• Block-diagonal matrix resolution
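
As an illustration of the first solver in the list, here is a textbook conjugate gradient sketch on a small dense system. It is a generic CPU version, not the HPCLib code: the names `Vec`, `Mat`, `dot`, `matvec` and `conjugate_gradient` are hypothetical, storage is a plain vector of rows, and the method as written assumes a symmetric positive-definite matrix.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // dense, row-major: A[i][j]

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static Vec matvec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Solves A x = b, assuming A is symmetric positive-definite.
// Starts from x = 0 and stops when the residual 2-norm drops below tol.
Vec conjugate_gradient(const Mat& A, const Vec& b,
                       double tol = 1e-12, std::size_t max_iter = 1000) {
    assert(A.size() == b.size());
    Vec x(b.size(), 0.0);
    Vec r = b;  // residual b - A x for the initial guess x = 0
    Vec p = r;  // first search direction
    double rs_old = dot(r, r);
    for (std::size_t k = 0; k < max_iter && std::sqrt(rs_old) > tol; ++k) {
        const Vec Ap = matvec(A, p);
        const double alpha = rs_old / dot(p, Ap);  // step length along p
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rs_new = dot(r, r);
        const double beta = rs_new / rs_old;  // keeps directions A-conjugate
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rs_old = rs_new;
    }
    return x;
}
```

Each iteration is dominated by one matrix-vector product and a few dot products, which is the kind of work that can be offloaded to a many-core device.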

Values containers
HPCLib contains the numerical containers generally used for intensive computing:
vectors, sparse vectors, dense matrices, sparse matrices, tridiagonal matrices, matrices of sparse vectors
• The R&D team continuously strives to enrich the algorithm bank of HPClib

Technical characteristics
• All algorithms and containers are implemented on CPU, Xeon Phi and GPU
• HPCLib is written in C++. Consequently, the library can be used from any language able to import a C/C++ library
• Mathematical libraries (MKL, BLAS) can be integrated into HPCLib
• HPCLib runs under Windows and Linux
• Daily regression tests drawn from real client cases:
o Black-Scholes
o Monte-Carlo
o Cryptography

Simple usage of HPCLib on C#
HPCLib & Excel

Simple usage of HPCLib on C#
HPCLib & Excel: Formula
How to build an HPC formula:
Format is:
= ARCHITECTURE ( Result_ARRAY ) = Matrix/Vector formula
Example:
= MIC (G11:G13) = C5:E7 * G2:G4
Or:
= CPU (Array1[HPCVector result]) = Array1[Vector A] + Array1[Vector B]
How to call an algorithm:
= MIC (P43:P45) = SVD (C10:E45; L10:L45; P43:P45)
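
The formula format above has three parts: a target architecture, a result range, and a matrix/vector expression. The sketch below shows one way such a string could be split into those parts. It is purely illustrative and says nothing about how the add-in actually parses formulas; `HpcFormula`, `trim` and `parse_hpc_formula` are hypothetical names.

```cpp
#include <string>

// The three parts of "= ARCHITECTURE ( Result_ARRAY ) = formula".
struct HpcFormula {
    std::string architecture;  // e.g. "MIC" or "CPU"
    std::string result_range;  // e.g. "G11:G13"
    std::string expression;    // e.g. "C5:E7 * G2:G4"
};

// Removes leading and trailing spaces and tabs.
static std::string trim(const std::string& s) {
    const char* ws = " \t";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits the text into its three parts. Returns false if the text does not
// follow the "= ARCH ( RANGE ) = EXPRESSION" shape.
bool parse_hpc_formula(const std::string& text, HpcFormula& out) {
    const std::string s = trim(text);
    if (s.empty() || s[0] != '=') return false;
    const std::string::size_type open = s.find('(');
    if (open == std::string::npos) return false;
    const std::string::size_type close = s.find(')', open);
    if (close == std::string::npos) return false;
    const std::string::size_type eq = s.find('=', close);
    if (eq == std::string::npos) return false;

    HpcFormula f;
    f.architecture = trim(s.substr(1, open - 1));
    f.result_range = trim(s.substr(open + 1, close - open - 1));
    f.expression = trim(s.substr(eq + 1));
    if (f.architecture.empty() || f.result_range.empty() || f.expression.empty())
        return false;
    out = f;
    return true;
}
```

Applied to the first example, this yields the architecture `MIC`, the result range `G11:G13` and the expression `C5:E7 * G2:G4`; the expression would still need its own evaluation step.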

Simple usage of HPCLib on Excel
HPCLib & Excel

HPCLib on Excel : Metal plate metrology example
HPCLib & Excel

Simple usage of HPCLib on Excel
HPCLib & Excel
Coord (X)
Coord (Y)
0       0.520000   0.541000   0.562000
0.3     0.604000   0.625000   0.646000
0.5     0.660000   0.681000   0.600000
0.68    0.710400   0.731400   0.762800
1.17    0.847600   0.868600   0.889600
2.22    0.841360   0.980000   0.843600
0.30    0.15

Performance Measures

Q&A
Why HPCLib?
• An algorithm bank that is simple to use and allows a quick response to clients' needs.
• HPCLib memory is managed internally and adapted to mathematical models.
• The cost of implementing new functionality is low.
• New technological architectures are easy to integrate.
• We use HPCLib to ensure the operational feasibility of a project.
Why do it on Windows and Excel?
• Compose all formulas in Excel while benefiting from the performance of the architectures.
Very close to becoming an open-source project!

Q&A
Questions?
Are you shy?
ddubuc@aneo.fr
