2. T I A D
Presenters
2
Stéphane Lapie
Presales Engineer – 2 Years at Splunk
Based in Paris, France
Specialized in App Delivery, IT Ops and Cloud
Went from Dev, to Ops to Presales
(former colleague of Adegbenga at BNP Paribas)
Adegbenga Amusa
HPC Developer – BNP Paribas, Risk Systems
Based in London, United Kingdom
Worked on Grid Computing for the past 10 years
Had a fair share of Ops in my career
3. T I A D
Once Upon a Time in a Financial Institution…
4. T I A D
Service
A Group of People was Working on Enhancing
Some…
5. T I A D
• Requirements
• Service Definition
• Implementation
• Validation & Testing
• Application Monitoring
• Incident Management
Business
Development
Operations Service
Customers
And Like Many Services There Were…
6. T I A D
Service
This Group of People Was Bound to Transform
the Service
7. T I A D
Risk Calculation
Business
Analysts
Java, C++,
Cuda
Developers
Application
Support
Risk Systems
Structurers
Risk Analysts
Quant Analysts
Data Analysts
8. T I A D
Risk Calculation
Counterparty Risk
A B
• Many Simulations
• Many Parameters
• Many Regulations
• … may be slow to calculate
9. T I A D
Risk Calculation
• Fresh information
• Accurate information
• Compliance with Market Regulations
• Fully parametrized on-demand simulations
• Long term simulations
• Quick results (understand instant)
• Ease of use !!!
• Integrated with the many other systems
High Customer Expectations
Structurers
Risk Analysts
Quant Analysts
Data Analysts
10. T I A D
A New Project Was Born
10
Goal:
Rebuild the Calculation Engine to use highly parallel computing on Graphical Processing Units
• Much Quicker Calculations
• Much More Simulations Capabilities
• Easier Models Implementation
• Reduce Hardware and Middleware cost
Project Codename:
HPCE
High Performance Compute Engine
11. T I A D
A New Project Was Born
11
GPU / SIMT
CPU / SMT
Complex and
different tasks
Not so complex
scalable tasks
few very-smart cores
many cheaper cores
CPU v GPU in a nutshell
12. T I A D12
CPU v GPU in a nutshell
Monte Carlo Methods
Simulation to calculate
Value at Risk
Parallel Calculations
A New Project Was Born
13. T I A D
Project’s Timeline
13
From Zero to Hero…
Q2 Q2
Dev Production
2013 2014+
1. Fully Integrated
2. Fully Functional
3. Identical Results to Legacy
14. T I A D
Project’s Timeline
14
From Zero to Hero…
Q2 Q2
Dev Production
2013 2014+
DevOps
…to the rescue!
Fast Forward◉
15. T I A D
Project’s Timeline
15
From Zero to Hero…
Q2 Q2
Dev Production
2013 2014+
Fast Forward◉
16. T I A D
Development App Support
16
DEV UAT STG PRDTEST TEST TEST
Automation
End-to-End Visibility
Version
XYZ
Version
XYZ
Version
1.0
Version
2.0
Version
3.0.1
Version
3.0.1.1
Version
4.2.0.1
Business
TEST TEST TESTTEST TEST TESTTEST TEST TEST
Delivery Pipeline Overview
17. T I A D
Step #1
17
Open Access to Environment’s Data
18. T I A D
What is Delivered
18
High Performance Compute Engine
Non-persistent Data
Persistent
Data
Compute Farm
REST
Services
GPU
Servers
CPU
Servers
Apache
Cassandra
+
Sybase IQ
+
(File System)
Oracle
Coherence
+
Data Services
IBM Platform
Symphony
+
Calculation
Services
Apache
Tomcat
19. T I A D
Step #2
19
Create indicators for each software components,
one after another…
20. T I A D
Consistent and Reusable Analytics
20
Business App Support Development
HPCE
Non-persistent Data
Persistent
Data
Compute Farm
REST
Services
DEV / UATSTAGINGPRODUCTION
21. T I A D
Step #3
21
Make it repeatable to improve progressively!
22. T I A D
Integration within Risk Systems
22
Non-persistent Data
Persi
stent
Data
Compute Farm
HPCE
REST
Services
Risk
Data Bus
Upstream
Systems
Downstream
Systems
Risk
Navigator
Risk
Data
Warehouse
Data In/Out
Simulations Requests
Results Browsing
23. T I A D
Integration within Risk Systems
23
Non-persistent Data
Persi
stent
Data
Compute Farm
HPCE
REST
Services
Risk
Data Bus
Upstream
Systems
Downstream
Systems
Risk
Navigator
Risk
Data
Warehouse
Data In/Out
Simulations Requests
Results Browsing
End-to-End Visibility
Business
App Support
Development
24. T I A D
Step #4
24
Increase the scope to get closer to
the “full picture”
25. T I A D
Data Sources Overview
Version
Control
QA / Testing
Tools
CM / Deployment
CI / Build
Automation
23
Applications
& Systems logs
CUSTOM LOGGING
Business App Support Development
26. T I A D26
Visualize ShareAnalyzeExplore
Schema-Free data collection & storage
Code
Repository
QA / Testing
Tools
CM / Deployment
CI / Build
Automation
Applications
Monitoring
Project & Issue
Tracking
Systems
Monitoring
Distributed
Storage &
Processing
Scales to
TBs/day
= Data generated by IT Systems
Machine Data
Data Sources Overview
29. T I A D29
Compute Farm
CPU
Server
s
GPU
Servers
Non-persistent Data
Persistent
Data
High Performance Compute Engine
REST
Services
Service Requests
Development
Requests / Responses
32. T I A D32
Compute Farm
CPU
Server
s
GPU
Servers
High Performance Compute Engine
REST
Services
Non-persistent Data
Persistent
Data
Data In and Out
Business
App Support
Development
Risk
Data Bus
39. T I A D39
Persistent
Data
Non-persistent Data
High Performance Compute Engine
Compute Farm
CPU
Servers
GPU
Servers
REST
Services
Calculations Analysis
Business
App Support
Development
47. T I A D47
High Performance Compute Engine
Persistent
Data
Non-persistent Data
Compute Farm
CPU
Servers
GPU
Servers
REST
Services
Infrastructure App Support
Development
56. T I A D
Key Take Away
56
1. Don’t withhold information, open access to everyone (% sensitiveness).
Share your thoughts and analysis with the other teams.
2. Take time to define Indicators that matters.
Repeat the operation with Dev, Ops and Biz people.
3. Most analysis should be consistent and repeatable.
Eventually from the Dev computer ’til Production platform
(containers can help nowadays).
4. Don’t rush and try do everything at once.
You’ll get there quickly enough if you streamline your efforts.
57. T I A D
Lessons Learned
57
Ø Today we managed to get an “Holistic” view of our systems and we can’t imagine working
without a solution like Splunk.
We did all this very quickly with Splunk and in much more depth than previously developing
custom tools.
Ø In the past most logs were considered “garbage” but made accessible they revealed many User
Behaviors we didn’t expect. It is of tremendous value to understand what customers do, which
features they use and how. This kind of analysis can trigger changes in the solution.
We started to put Logging Best Practices in place to make sure the “garbage” can be “recycled” ;)
Ø Everybody within Risk Systems is using Splunk for ad-hoc searches and sharing search links but
very few people share Dashboards and Reports today.
Going forward we’d like to enable more people and make sure they can collaborate on
improving our internal Splunk Apps.