3. Risk Management
Market activity is growing because of large profit opportunities
Instability in global finance:
- sophisticated financial products
- derivatives
- credit
Market regulation – the Basel accords in Europe
- A capital ratio of n% must be set aside (blocked) by financial institutions
- The ratio can be reduced if the institution runs a high-quality risk measurement system
4. Risk Management
Market Risk
Market volatility
Stress scenarios based on historical events (e.g. 11 September 2001)
Value at Risk (VaR): the maximum loss over a period of time that is exceeded
with probability of only (1 - Percentile)%, i.e. P(Loss > VaR) = 1 - Percentile
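As a toy illustration (not from the deck): with historical or Monte Carlo simulation, VaR at a given percentile can simply be read off the sorted vector of simulated losses. The function name and approach below are assumptions for illustration.

#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative sketch: VaR from one simulated P&L value per scenario.
// With percentile = 0.99, the returned loss is exceeded in only 1% of scenarios.
double valueAtRisk(std::vector<double> pnl, double percentile) {
    for (double &x : pnl) x = -x;          // convert P&L to losses
    std::sort(pnl.begin(), pnl.end());     // losses in increasing order
    std::size_t idx = static_cast<std::size_t>(percentile * (pnl.size() - 1));
    return pnl[idx];                       // the percentile-th loss quantile
}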
5. Risk Management
Credit Risk
Counterparty default: countries, organisations
Market conditions stressed at time points up to 50 years in the future
Credit V@R: a curve of exposures, one per time point
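A minimal sketch of how such an exposure curve could be derived from simulated PVs, extending the quantile idea above to one value per time point; the names and layout are illustrative, not the deck's implementation.

#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrative sketch: pv[t][s] is the netted portfolio value at time point t
// under simulation s; the curve keeps a high quantile of max(PV, 0) per time point.
std::vector<double> exposureCurve(std::vector<std::vector<double>> pv,
                                  double percentile) {
    std::vector<double> curve;
    curve.reserve(pv.size());
    for (auto &sims : pv) {
        for (double &v : sims) v = std::max(v, 0.0);   // exposure = max(PV, 0)
        std::sort(sims.begin(), sims.end());
        std::size_t idx = static_cast<std::size_t>(percentile * (sims.size() - 1));
        curve.push_back(sims[idx]);
    }
    return curve;
}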
7. Risk Management
Requirements
Availability – 99% blade availability
Performance – low latency, high throughput
Scalability – ideally linear scalability
Reliability – fault tolerance, retry strategies, engine black-listing
Maintainability – upgrades with minimal effort
Extensibility – models are unstable; it must be possible to modify the processing
Security – control over data access
Manageability – resource-consumption statistics, policy and SLA management for multiple clients
8. Counterparty Exposure
• Between 6 and 40 daily runs
• Each calculation cube (axes: Nb Deals x Nb Time Points x Nb Simulations):
• 10 000 000 deals
• 150 time points
• 10 000 simulations
• 1 000 000 aggregations
• -> 15 trillion operations per calculation (10 000 000 x 150 x 10 000 = 1.5 x 10^13 PVs)
• -> 450 terabytes of intermediate data
[Figure: the risk system IT calculation cube, with sample PVs {1.25, 2.33, 0.95} along the Nb Simulations axis]
10. Counterparty Exposure
• Split by trade
[Figure: the PV cube sliced by trade; each trade row (Trade 0, Trade 1, ... Trade n) is priced across all simulations and time points]
Loop over {Trades}:
    Send Trade_i to Blade_n
    Fetch [Simulations] from Blade_n
next iteration
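A minimal C++ sketch of this loop, under stated assumptions: the Trade/PVRow types and the sendToBlade/fetchFromBlade helpers are hypothetical stand-ins, not the DataSynapse Driver API.

#include <cstddef>
#include <vector>

// Hypothetical shorthand for the slide's pseudocode -- NOT the real Driver API.
struct Trade {};
struct PVRow { std::vector<double> pv; };  // PVs of one trade, all simulations
void  sendToBlade(std::size_t /*blade*/, const Trade &) {}        // stub: ship the trade
PVRow fetchFromBlade(std::size_t /*blade*/) { return PVRow{}; }   // stub: collect its PV row

// Split by trade: each iteration ships one trade to a blade, which prices it
// under every simulation and returns one full row of the PV matrix.
std::vector<PVRow> runByTrade(const std::vector<Trade> &trades, std::size_t nbBlades) {
    std::vector<PVRow> pvMatrix;
    pvMatrix.reserve(trades.size());
    for (std::size_t i = 0; i < trades.size(); ++i) {
        std::size_t blade = i % nbBlades;  // trivial round-robin placement
        sendToBlade(blade, trades[i]);
        pvMatrix.push_back(fetchFromBlade(blade));
    }
    return pvMatrix;
}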
11. Counterparty Exposure
• Split by trade
The full set of simulations is too big to fit in a blade's memory.
If N simulations can fit in memory, the network transfer is
{Trades} * {Simulations} * size(simulation) / N = 5 000 TB / N
(with 10^7 trades and 10^4 simulations, this corresponds to a simulation size of roughly 50 KB)
The PV matrix is generated on the blades.
Data affinity can considerably reduce the network transfer:
-> Keep the simulations on the blade as "state"
-> The client must maintain orchestration and reliability
13. Counterparty Exposure
Split by scenario
[Figure: the PV cube sliced by simulation; each simulation column (Simulation 0, Simulation 1, ... Simulation n) is pinned to a blade and the trades (Trade 0 ... Trade i) are streamed through it]
Loop over {Simulations}:
    Send {Simulations n to m} to {Blades}
    Send {Trades} to {Blades}
next iteration
14. Counterparty Exposure
Split by scenario
Affinity -> data-centric approach
Scenarios are sent only once to the blades
Each trade is sent {Scenarios} times, but size(trade) << size(scenario)
The PV matrix is constructed progressively on the client (see the sketch below)
-> The client must handle reliability and fault tolerance
-> The client must maintain all the state for the generated PVs
-> Too heavy for a single client; multiple clients will be needed
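A minimal sketch of such progressive client-side assembly, assuming one asynchronous result per (trade, simulation) cell; the PVCollector name and shape are illustrative, not part of GridServer. The amount of state it holds is exactly why a single client becomes too heavy.

#include <atomic>
#include <cstddef>
#include <vector>

// Illustrative sketch: the client fills in the PV matrix as engine results
// arrive. Each cell is written exactly once, so only the counter is atomic.
class PVCollector {
public:
    PVCollector(std::size_t nbTrades, std::size_t nbSimulations)
        : pv_(nbTrades, std::vector<double>(nbSimulations)),
          done_(0), total_(nbTrades * nbSimulations) {}

    // Called from the asynchronous result callback for one (trade, simulation) cell.
    void onResult(std::size_t trade, std::size_t simulation, double value) {
        pv_[trade][simulation] = value;
        ++done_;
    }

    bool complete() const { return done_.load() == total_; }

private:
    std::vector<std::vector<double>> pv_;   // trades x simulations
    std::atomic<std::size_t> done_;
    const std::size_t total_;
};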
15. Counterparty Exposure
Aggregation
The whole PV matrix is required: point-to-point aggregation
-> Too big to fit in the client's memory
-> Constant disk access
Aggregator processes can also be distributed (see the sketch below)
-> With the same underlying reliability and fault-tolerance problems
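As one possible shape for distributed aggregation (an illustration, not the deck's design): pairwise tree reduction of per-trade exposure vectors, so no single process ever holds the full PV matrix. The levels of the tree can run on separate aggregator processes.

#include <cstddef>
#include <vector>

using Exposures = std::vector<double>;  // one value per time point

// Combine two partial aggregates; real netting rules would plug in here.
Exposures combine(const Exposures &a, const Exposures &b) {
    Exposures out(a.size());
    for (std::size_t t = 0; t < a.size(); ++t)
        out[t] = a[t] + b[t];
    return out;
}

// Tree-wise reduction: each pass halves the number of partial results, and
// each worker only ever holds two partials at a time.
Exposures aggregate(std::vector<Exposures> parts) {
    if (parts.empty()) return {};
    while (parts.size() > 1) {
        std::vector<Exposures> next;
        for (std::size_t i = 0; i + 1 < parts.size(); i += 2)
            next.push_back(combine(parts[i], parts[i + 1]));
        if (parts.size() % 2 == 1) next.push_back(parts.back());
        parts.swap(next);
    }
    return parts.front();
}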
17. Datasynapse GridServer
Director
Entry access point, authentication
Engine balancer: moves engines between brokers depending on load, demand and
policies (weight-based, home/shared)
Routes clients to brokers
Broker
Handles Client (Driver) requests
Schedules service sessions
Schedules service instances onto engines
Each scheduling period takes engine states, discriminators and blacklisting into account
Maintains a pool of engines
18. Datasynapse GridServer
Engine Daemon
One daemon per machine
Manages the Engine instances
Interacts with the Director – migration between brokers
Engine instance
One per CPU
Manages the application
Communicates with the Client (Driver) for data and processing
Maintains state, checkpoints and init data
Interacts with the Broker, receiving assignments and interruptions
19. Datasynapse GridServer
Driver
Embedded in the client
Provides APIs in C++, Java, .NET and SOAP
- Service-oriented: loose interaction between client code and engine code
- Object-oriented: client and service are coupled and exchange data in intermediate objects
- PDriver: PDS scripting language – used for MPI jobs
Data is generally transferred directly between client and engines – an HTTP server runs on the driver
Data collection is immediate, later or never
For COLLECT_LATER and COLLECT_NEVER it is better to push the data to the broker: in case of
failure the client may no longer be there, so the input data would be lost
Initial data can be pushed – it is kept as static data on the engine machine and
reused by different instances of the service
20. Datasynapse GridServer
Architecture:
Multiple brokers – one per organisation – with a Service Level Agreement (SLA) defined
for each Line Of Business (LOB)
Failover brokers – service states are stored in a database and resubmitted through the
failover broker if a live broker goes down
A secondary director becomes active on failure of the primary director
21. Datasynapse GridServer
Brokers form a "partition" of shared engines
-> Home/shared or weight-based
Home/Shared
Each engine has a default "home" broker and migrates to a "shared" broker when there is demand.
Engines are interrupted on the shared broker if there is demand on their home broker.
Various parameters define the pace of migration and thus the fluidity of the system.
Weight-Based
Engines are homed to brokers indifferently, based on weights configured on the Director (e.g.
60%-40%).
Movement between brokers follows the same procedure as Home/Shared.
23. Counterparty Exposure
Split by Scenario (2)
// Asynchronous callback (includes added; the DataSynapse Driver headers that
// declare ServiceInvocationHandler and ServiceInvocationException are omitted)
#include <iostream>
#include <string>
using namespace std;

class InvocationHandler : public ServiceInvocationHandler {
public:
    // Invoked by the Driver when a service invocation completes successfully.
    void handleResponse(const string &response, int id) {
        cout << "response from async callback " << id << ": " << response << endl;
    }
    // Invoked by the Driver when a service invocation fails.
    void handleError(ServiceInvocationException &e, int id) {
        cout << "error from async callback " << id << ": " << e.getMessage() << endl;
    }
};
All invocations are queued on the broker.
The scheduler forwards each one to the engine holding the right scenario.
The InvocationHandler callback is triggered after each invocation completes; the
result data is pushed directly by the engines to avoid a bottleneck on the broker.
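For orientation only, a hedged sketch of how such a handler might be wired up: only the InvocationHandler above comes from the slide; the ServiceFactory/createService/submit calls and the other names below are assumptions, not verified GridServer Driver API.

InvocationHandler handler;
// Assumed API names -- verify against the GridServer Driver documentation.
Service *service = ServiceFactory::getInstance()->createService("ExposureService");
for (int s = 0; s < nbScenarioBatches; ++s) {   // nbScenarioBatches: hypothetical
    // One asynchronous invocation per scenario batch; the broker routes it to
    // the engine already holding those scenarios (data affinity).
    service->submit("priceTrades", makeRequest(s), &handler);  // makeRequest: hypothetical
}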