Experts in numerical algorithms
and HPC services
NAG for HPC Finance
John Holden
john.holden@nag.co.uk
14th July 2015
The good, bad and ugly of
accelerators in finance
and a complementary path
2
Agenda
 NAG Introduction
 NAG for HPC Finance
 Why Quants Love NAG
 Accelerators
 NVIDIA
 Intel Xeon-Phi
 Algorithmic Differentiation
 Summary
3
NAG Background
 Founded 1970
 Not-for-profit organisation
 Surpluses fund on-going R&D
 Mathematical and Statistical Expertise
 Numerical Libraries of components
 Consulting
 HPC Services
 Computational Science and Engineering (CSE) support
 Procurement advice, market watch, benchmarking
4
HPC Services
 Government, Academic
and Commercial
 Full CSE service
 Code porting, tuning,
scaling, rewriting…
 Training
 1-20 FTEs per annum
 Procurement
advice/benchmarking
ARM
5
Financial Services
 Many clients in FSI
 Most Tier 1 Banks have
licences
 > 60% have global
licences
 Typically the NAG
Library is embedded in
the bank's own “quant”
libraries (C++, .NET,
Java, Python,…)
7
Why do Quants use NAG Libraries and Toolboxes?
 Global reputation for quality – accuracy, reliability
and robustness…
 Extensively tested, supported and maintained code
 Reduces development time
 Allows concentration on your key areas
 Components
 Fit into your environment
 Simple interfaces to your favourite packages
 Regular performance improvements!
 Give “qualified error” messages e.g. tolerances of
answers
8
from Finance - k Factor Problem
9
from Finance - k Factor Problem
Principal Factors method (Andersen et al., 2003)
does NOT always converge to the correct answer…
(no convergence theory)
Should have come to NAG…
Our* spectral projected gradient method respects
constraints, exploits convexity, converges to a feasible
stationary point
*NAG Library G02AE - Borsdorf, Higham & Raydan, 2010
10
NAG Library and Toolbox Contents
 Root Finding
 Summation of Series
 Quadrature
 Ordinary Differential
Equations
 Partial Differential Equations
 Numerical Differentiation
 Integral Equations
 Mesh Generation
 Interpolation
 Curve and Surface Fitting
 Optimization
 Approximations of Special
Functions
 Dense Linear Algebra
 Sparse Linear Algebra
 Correlation & Regression
Analysis
 Multivariate Methods
 Analysis of Variance
 Random Number Generators
 Univariate Estimation
 Nonparametric Statistics
 Smoothing in Statistics
 Contingency Table Analysis
 Survival Analysis
 Time Series Analysis
 Operations Research
11
Use of NAG Software in Finance
 Portfolio allocation / Risk management / Stress testing
 Optimization, interpolation, linear algebra, RNGs, Distributions,
Copulas…
 Derivative pricing, Hedging
 PDEs, RNGs, multivariate normal, curve & surface fitting,
quadrature…
 Calibration
 Optimisation, Interpolation, Root Finders, Splines
 Data analysis
 Time series, GARCH, principal component analysis, data smoothing,
Data Mining…
 Monte Carlo simulation
 RNGs, Brownian Bridge constructor, Linear Algebra
 …
12
Why Quantitative Analysts Love NAG
 General Problem
 To build asset models and risk engines in a timely manner
that are
 Robust
 Stable
 Quick
 Solution
 Use robust, well tested, fast numerical components
 This allows the “expensive” experts to concentrate on the
modelling and interpretation
 avoiding distraction with low level numerical components
13
Problem 1: Simulation (Monte Carlo)
 Simulation is important for scenario generation
 Several different numerical components needed
 Random Number Generators
 Brownian bridge constructor
 Interpolation/Splines
 Principal Component Analysis
 Cholesky Decomposition
 Distributions (uniform, Normal, exponential, gamma,
Poisson, Student’s t, Weibull,..)
 ..
14
Problem 1: Simulation (Monte Carlo)
 Simulation is important for scenario generation
NAG to the rescue (CPU or GPU)
 Several different numerical components needed
 Random Number Generators √
 Brownian bridge constructor √
 Interpolation/Splines √
 Principal Component Analysis √
 Cholesky Decomposition √
 Distributions (uniform, Normal, exponential, gamma,
Poisson, Student’s t, Weibull,..) √
 .. √
15
Problem 2: Calibration
 Financial institutions all need to calibrate their
models
 Several different numerical components needed
 Optimisation functions (e.g. constrained non-linear
optimisers)
 Interpolation functions (used intelligently*)
 Spline functions
 ..
*An interpolator must be used carefully: you must know its properties to pick an appropriate method
16
Problem 2: Calibration
 Financial institutions all need to calibrate their
models
NAG to the rescue
 Several different numerical components needed
 Optimisation functions (e.g. constrained non-linear
optimisers) √
 Interpolation functions (used intelligently*) √
 Spline functions √
 .. √ √
*interpolator must be used carefully –must know the properties to pick appropriate method
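The simplest calibration is one model parameter fitted to one market quote, which reduces to root finding. The sketch below backs out a Black-Scholes implied volatility by bisection. Both the model and the bisection are choices made for this illustration, not methods named in the talk, and `norm_cdf`, `bs_call` and `implied_vol` are hypothetical helper names rather than NAG routines.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, expiry, vol):
    """Black-Scholes price of a European call (no dividends)."""
    sd = vol * math.sqrt(expiry)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * expiry) / sd
    d2 = d1 - sd
    return spot * norm_cdf(d1) - strike * math.exp(-rate * expiry) * norm_cdf(d2)


def implied_vol(price, spot, strike, rate, expiry, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on vol; relies on the call price increasing with vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, expiry, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price an option at 25% vol, then recover the vol from the price.
quote = bs_call(100.0, 105.0, 0.02, 1.0, 0.25)
vol = implied_vol(quote, 100.0, 105.0, 0.02, 1.0)
```

Real calibrations fit many parameters to many quotes at once, which is where the constrained non-linear optimisers and carefully chosen interpolators come in.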
18
Where has my Escalator gone?
 The old escalator: want more performance? Buy the next processor!
 To get performance/efficiency we now have to go
(massively) parallel
 Disruption is prompting a serious look at ‘other’
technologies and algorithms!
 Even CPUs have tens of cores per node
 Hybrid, shared-memory and distributed-memory
parallelism
 Painful whichever way we turn!
19
Accelerators
 Loose definition: hardware on which your software
runs better than on your (general-purpose) CPU
 Generally NOT an easy win
 Significant learning curve and effort
 Offload disadvantages
 …
20
Accelerators
 ClearSpeed
 Similar to GPU
 Lacked a good software ecosystem
 IBM Cell
 Lacked a good software ecosystem
 GPGPU
 NVIDIA invested in the software ecosystem (AMD did not!)
 Intel Phi
 Early days – an encouraging start
 Expecting a lot more with Knights Landing!
21
NAG and GPGPUs (NVIDIA)
 We provide
 A suite of numerical routines for Monte Carlo simulation
 from a collaboration with Professor Mike Giles
 also MAGMA-based linear algebra from Jack Dongarra
 worked with Professor William Shaw to implement new inverse
CDFs (new distributions and speed-ups to existing code)
 “Bespoke” consultancy codes
 PDE Solver for Stochastic Local Volatility
 FX Basket Option, Local Vol Model
 Solutions combining GPU and Algorithmic Differentiation
 More on that later…
 Training courses for CUDA and OpenCL
22
NAG and Intel Xeon-Phi
 Relatively easy to take existing OpenMP-based code and port it to Phi
 Tuning for Phi takes some learning and expertise
 … but feedback into Xeon code is often very strong
 Performance issues
 As always, need large enough problems to make the offload
worthwhile
 Phi seems to have significant offload overheads
 NAG Library for Intel Xeon-Phi available
23
NAG Distance Matrix (g03ea) – Intel Xeon Phi
[Chart: Time (s), 0 to 450, against Problem Size (n), 0 to 30000. Series: 32 threads original; Phi offload original; Phi offload opt; 32 threads opt]
 n=30k; m=3k
 Xeon 32t: 192s
 Xeon 32t*: 75.7s
 Phi 240t*: 40.6s
 Phi gain ~5x over original or ~2x over optimised
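A quick check of the quoted gains, assuming the asterisk marks the optimised code and that "original" refers to the 192 s Xeon run:

```latex
\[
\frac{T_{\text{Xeon 32t}}}{T_{\text{Phi 240t*}}} = \frac{192}{40.6} \approx 4.7,
\qquad
\frac{T_{\text{Xeon 32t*}}}{T_{\text{Phi 240t*}}} = \frac{75.7}{40.6} \approx 1.9,
\qquad
\frac{T_{\text{Xeon 32t}}}{T_{\text{Xeon 32t*}}} = \frac{192}{75.7} \approx 2.5
\]
```

The third ratio shows that the tuning effort alone made the Xeon code about 2.5x faster, before any offload.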
24
Uniform RNG – NAG Mersenne Twister (g05sa) – Intel Xeon Phi
[Chart: Time (s), 0.00 to 1.60, against Size of problem (n, log scale), 100 to 100,000,000. Series: 8 threads original; Native Phi original; Native Phi opt; 8 threads opt]
 n=500m
 Xeon 8t: 0.25s
 Phi 240t: 1.50s
 Xeon 8t*: 0.22s
 Phi gain ~3x
25
NAG & AD: Algorithmic Differentiation in a nutshell
Computers can only add, subtract, multiply and divide
numbers.
 A computer program implementing a model is many
of these basic operations strung together
 Elementary to compute the derivatives of these
 Chain rule + basic derivatives = program derivative
 Classes, templates and operator overloading can do
this efficiently and non-intrusively
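The recipe above (basic derivatives plus the chain rule, applied through operator overloading) can be sketched in Python. This is a toy forward-mode illustration under stated assumptions: the `Dual` class and `model` function are hypothetical names invented for this example, and this is not how the NAG AD tools are implemented.

```python
class Dual:
    """A value carried together with its derivative w.r.t. one chosen input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(x):
        # Plain numbers are constants: derivative zero.
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual._lift(other)
        # Quotient rule.
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value ** 2)

    def __rtruediv__(self, other):
        return Dual._lift(other) / self


def model(x):
    # Ordinary-looking model code: only +, -, * and / are used.
    return (x * x + 3 * x) / (1 + x)


# Seed the input derivative with 1 to obtain d(model)/dx alongside the value.
result = model(Dual(2.0, 1.0))
print(result.value, result.deriv)
```

The model code itself is unchanged; only the type of its argument differs, which is the sense in which the approach is non-intrusive.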
26
To get acceleration look at the algorithms
5000 gradient evaluations of LIBOR Market Model*
[Chart: Runtime (s), 0 to 350, against Number of inputs (size of Delta), 50 to 200. Series: using finite differences (bumping); using adjoints; 2nd-order adjoints (projected Hessian)]
*M.B. Giles and P. Glasserman, `Smoking adjoints: fast Monte Carlo Greeks', RISK, January 2006
27
Computing derivatives in finance is important…
 Calculating a product’s sensitivities to a range of risk
factors (a.k.a. Greeks) creates huge computational
demand on risk and price models
 Traditional approach: “bumping” (finite differences)
 Computationally very expensive… needs more hardware!
The alternatives to finite differences are
 Write derivative code by hand
 Efficient, but difficult to write & highly error prone (need to
develop original and adjoint models)
 Algorithmic Differentiation
 Flexible: you only develop the original model, so it is the obvious choice
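The cost difference between bumping and adjoints can be sketched with a toy tape in Python. Everything here is an assumption made for illustration: `Var`, `backward`, `model`, `bump_gradient` and `adjoint_gradient` are hypothetical names, the toy model stands in for a pricing model, and the recursive traversal is only suitable for small examples.

```python
class Var:
    """Tape node: value, adjoint, and (parent, local derivative) pairs."""

    def __init__(self, value, parents=()):
        self.value = value
        self.adjoint = 0.0
        self.parents = parents

    def __add__(self, other):
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))


def backward(output):
    """One reverse sweep: propagate adjoints from the output to all inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # parents are appended before the node itself

    visit(output)
    output.adjoint = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.adjoint += local * node.adjoint


def model(xs):
    # Toy stand-in for a pricing model; works on floats and on Var objects.
    total = xs[0] * xs[1]
    for i in range(1, len(xs) - 1):
        total = total + xs[i] * xs[i + 1]
    return total


def bump_gradient(f, xs, h=1e-6):
    """One-sided bumping: one base run plus one bumped run per input."""
    base = f(xs)
    grad = []
    for i in range(len(xs)):
        bumped = list(xs)
        bumped[i] += h
        grad.append((f(bumped) - base) / h)
    return grad, len(xs) + 1  # gradient, number of model evaluations


def adjoint_gradient(f, xs):
    """One taped model run plus one reverse sweep, whatever the input count."""
    inputs = [Var(x) for x in xs]
    backward(f(inputs))
    return [v.adjoint for v in inputs], 1


xs = [1.0, 2.0, 3.0, 4.0]
print(bump_gradient(model, xs))
print(adjoint_gradient(model, xs))
```

Bumping needs a number of model runs that grows with the number of inputs, while the adjoint needs one taped run and one reverse sweep; the price is the memory needed to store the tape.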
28
NAG and AAD
 Adjoint Algorithmic Differentiation (AAD) reduces
runtime
 With RWTH Aachen University (Prof. Uwe Naumann
et al.), NAG is delivering Algorithmic Differentiation
(AD) tools and services to the finance community for
C/C++/CUDA codes.
 Our example codes include
 LIBOR Market Model
 PDE based Local Volatility model
 GPU accelerated Local Vol Basket Option pricer
 Our solutions deliver for accelerators
29
A few numbers
Monte Carlo
n     f      cfd        AD               AD/f     cfd/AD
34    0.5s   29.0s      3.0s (2.5MB)     6.0x     9.7x
62    0.7s   80.9s      5.1s (3.2MB)     7.3x     15.9x
142   1.5s   423.5s     12.4s (5.1MB)    8.3x     34.2x
222   2.3s   1010.7s    24.4s (7.1MB)    10.6x    41.4x
PDE
n     f      cfd        AD               AD/f     cfd/AD
34    0.6s   37.7s      11.6s (535MB)    19.3x    3.3x
62    1.0s   119.5s     18.7s (919MB)    18.7x    6.4x
142   2.6s   741.2s     39s (2GB)        15.0x    19x
222   4.1s   1857.3s    60s (3GB)        14.6x    31x
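The last two columns can be recomputed from the timings. The sketch below assumes, as a reading of the headers rather than something the slide states, that f is the time for one model evaluation, cfd the time for a finite-difference gradient, AD the time for an adjoint gradient (tape memory in brackets), and that the ratio columns are AD/f and cfd/AD.

```python
# Timings in seconds copied from the slide, one tuple per row: (f, cfd, AD).
monte_carlo = [(0.5, 29.0, 3.0), (0.7, 80.9, 5.1),
               (1.5, 423.5, 12.4), (2.3, 1010.7, 24.4)]
pde = [(0.6, 37.7, 11.6), (1.0, 119.5, 18.7),
       (2.6, 741.2, 39.0), (4.1, 1857.3, 60.0)]


def ratios(rows):
    """Return (AD / f, cfd / AD) for each row of timings."""
    return [(ad / f, cfd / ad) for f, cfd, ad in rows]


for name, rows in (("Monte Carlo", monte_carlo), ("PDE", pde)):
    for ad_over_f, cfd_over_ad in ratios(rows):
        print(f"{name}: AD/f = {ad_over_f:.1f}x, cfd/AD = {cfd_over_ad:.1f}x")
```

Under that reading the adjoint gradient costs a roughly constant multiple of one model evaluation, while its advantage over finite differences grows with the number of inputs n.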
30
AAD and GPUs
 “AAD Vs GPUs: banks turn to maths tricks as chips lose
appeal” risk.net, Jan 2015 – NONSENSE… surely
combining AAD and GPUs makes the ultimate
accelerator!
 “…. Join the AAD revolution” risk.net, July 15 – making
a lot more sense…
“In computational finance, there is no silver bullet. AAD is an
algorithmic advance… …GPUs are parallel compute
accelerations. The two are complementary” J. Ashley, IBM
 NAG is already delivering “combined” solutions to
our clients (in FSI and other sectors)
32
HPC Finance - Summary
 NAG is keen to collaborate in building models and
risk engines
 Requirements are likely to be varied across FSI
 We want to make sure we have what you need
 The importance of HPC Finance is growing and will
involve a LOT of computation (Basel III, CVA,…)
 NAG has significant experience in HPC libraries, services,
consulting and training
 We know how to do large scale computations efficiently
 This is non-trivial! Our expertise has been sought out and
exploited by organisations such as AMD, HECToR,
Microsoft, Oracle, major banks, major oil & gas companies…
33
References
 www.nag.co.uk
 AD explained: http://www.nag.co.uk/pss/nag-and-algorithmic-differentiation
 Adjoint Algorithmic Differentiation Tool Support for Typical
Numerical Patterns in Computational Finance:
http://www.nag.co.uk/doc/techrep/pdf/tr3_14.pdf
 Adjoint Algorithmic Differentiation of a GPU Accelerated
Application: http://www.nag.co.uk/Market/articles/adjoint-algorithmic-differentiation-of-gpu-accelerated-app.pdf

ISC Frankfurt 2015: Good, bad and ugly of accelerators and a complementary path
