Risk Management using Hadoop
Transcript

  • 1. Welcome to Redefining Perspectives, November 2012
  • 2. Capital Markets Risk Management and Hadoop. Kevin Samborn and Nitin Agrawal
  • 3. Agenda
    • Risk Management
    • Hadoop
    • Monte Carlo VaR Implementation
    • Q&A
  • 4. Risk Management
  • 5. What is Risk Management?
    • Risk is a tool – the goal is to understand and optimize risk
      o Too much risk is locally and systemically dangerous
      o Too little risk means the firm may be “leaving profit on the table”
    • Portfolio exposure
      o Modern portfolios contain many different types of assets
      o Simple instruments, complex instruments, and derivatives
    • Many types of risk measures
      o Defined scenario-based stress testing
      o Value at Risk (VaR)
      o “Sensitivities”
    • The key is valuation under different scenarios
    • VaR is used in banking regulation, margin calculations, and risk management
  • 6. Value at Risk (VaR)
    • VaR is a statistical measure of risk, expressed as an amount of loss at a given probability, e.g. a 97.5% chance that the firm will not lose more than 1 million USD over the next 5 days
    • Computing VaR is a challenging data-sourcing and compute-intensive process
    • VaR calculation:
      o Generate statistical scenarios of market behavior
      o Revalue the portfolio for each scenario and compare returns to today’s value
      o Sort the results and select the return at the desired percentile: VALUE AT RISK
    • Different VaR techniques:
      o Parametric – analytic approximation
      o Historical – captures real (historical) market dynamics
      o Monte Carlo – many scenarios, depends on statistical distributions
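    As an illustration of the final step, here is a minimal Java sketch (not from the deck; the helper and the sample numbers are invented) that sorts simulated P&L values and reads off VaR at a given confidence level:

        import java.util.Arrays;

        public class SimpleVaR {
            // VaR from simulated portfolio P&L values (negative = loss),
            // at e.g. confidence = 0.975 for 97.5% VaR.
            public static double valueAtRisk(double[] pnl, double confidence) {
                double[] sorted = pnl.clone();
                Arrays.sort(sorted);                                  // worst losses first
                int index = (int) Math.floor((1.0 - confidence) * sorted.length);
                return -sorted[index];                                // report loss as a positive number
            }

            public static void main(String[] args) {
                double[] pnl = {-1.2e6, -0.4e6, 0.1e6, 0.9e6, -2.5e6, 0.3e6, -0.1e6, 0.6e6};
                System.out.printf("97.5%% VaR: %.0f USD%n", valueAtRisk(pnl, 0.975));
            }
        }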
  • 7. VaR Graphically (chart). Source: An Introduction To Value at Risk (VAR), Investopedia, May 2010
  • 8. Complexities
    • For modern financial firms, VaR is complex. Calculation requirements:
      o Different types of assets require different valuation models (risk-based approach vs. full revaluation)
      o With large numbers of scenarios, many thousands of calculations are required
      o Monte Carlo simulations require significant calibration, which depends on large amounts of historical data
    • Many different reporting dimensions
      o VaR is not additive across dimensions: product/asset class, currency
      o Portfolio – including “what-if” and intraday activity
    • Intraday market changes requiring new simulations
    • Incremental VaR – how a single (new) trade contributes to the total (see the sketch below)
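    To make the incremental VaR bullet concrete, a self-contained sketch (the scenario-aligned P&L arrays are an assumption, not the deck's code): incremental VaR is the portfolio VaR with the candidate trade minus the VaR without it.

        import java.util.Arrays;

        public class IncrementalVaR {
            // VaR at the given confidence from simulated P&L (same logic as the earlier sketch).
            static double valueAtRisk(double[] pnl, double confidence) {
                double[] s = pnl.clone();
                Arrays.sort(s);
                return -s[(int) Math.floor((1.0 - confidence) * s.length)];
            }

            // Incremental VaR of a trade: VaR(portfolio + trade) - VaR(portfolio).
            // basePnl and tradePnl hold P&L per scenario; scenario i must line up in both arrays.
            static double incrementalVaR(double[] basePnl, double[] tradePnl, double confidence) {
                double[] combined = new double[basePnl.length];
                for (int i = 0; i < basePnl.length; i++) combined[i] = basePnl[i] + tradePnl[i];
                return valueAtRisk(combined, confidence) - valueAtRisk(basePnl, confidence);
            }
        }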
  • 9. Backtesting VaR
  • 11. Hadoop Core
    • Storage (HDFS):
      o Data stored with REDUNDANCY on a Distributed File System
      o Abstracts H/W FAILURES, delivering a highly available service on COMMODITY H/W
      o SCALES from a single node to thousands of nodes
      o Data stored WITHOUT A SCHEMA
      o Tuned for SEQUENTIAL DATA ACCESS
    • Processing (MapReduce):
      o Provides an EASY ABSTRACTION for processing large data sets
      o Infrastructure for PARALLEL DATA PROCESSING across a huge commodity cluster
      o Infrastructure for TASK and LOAD MANAGEMENT
      o Framework achieves DATA-PROCESS LOCALITY
    • Makes two critical assumptions, though:
      o Data doesn’t need to be updated
      o Data doesn’t need to be accessed randomly
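    As a small illustration of the storage points above, a sketch using the standard org.apache.hadoop.fs API (the file path and the record written are hypothetical) that writes raw, schema-free bytes to HDFS with an explicit replication factor:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FSDataOutputStream;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        public class HdfsWriteExample {
            public static void main(String[] args) throws Exception {
                Configuration conf = new Configuration();             // picks up core-site.xml etc.
                FileSystem fs = FileSystem.get(conf);
                Path path = new Path("/data/prices/2012-11-23.csv");  // hypothetical path
                // Blocks of this file will be replicated across 3 datanodes.
                try (FSDataOutputStream out = fs.create(path, (short) 3)) {
                    out.writeBytes("BP,23-Nov,435.25,435.5\n");       // raw bytes: no schema imposed
                }
            }
        }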
  • 12. A Simple MapReduce Job
    Problem statement: from historical price data, create a frequency distribution of the 1-day percentage change for various stocks.
    Sample input (Stock, Date, Open, Close): BP, 23-Nov, 435.25, 435.5; NXT, 23-Nov, 3598, 3620; MKS, 23-Nov, 378.5, 380.7; and so on for 22-Nov back through 15-Nov.
    Mappers emit (ticker, percentage change) pairs; after the shuffle/sort, reducers build the per-ticker frequency distribution and emit (ticker|change, count) pairs, e.g. BP|1, 33; BP|2, 64; NXT|81, 2; NXT|-20, 5.

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
            context.write(new Text(sa.getTicker()),
                          new IntWritable(sa.getPercentChange()));
        }

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            Map<Integer, Long> freqDist = buildFreqDistribution(values);
            for (Integer percentChange : freqDist.keySet()) {
                context.write(new Text(key.toString() + "|" + percentChange),
                              new LongWritable(freqDist.get(percentChange)));
            }
        }
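    The slide shows only the map and reduce bodies; a hedged sketch of the driver that would wire them together (FreqDistMapper and FreqDistReducer are assumed wrapper classes for the bodies above, not names from the deck):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

        public class FreqDistDriver {
            public static void main(String[] args) throws Exception {
                Job job = new Job(new Configuration());
                job.setJobName("PriceChangeFreqDist");
                job.setJarByClass(FreqDistDriver.class);
                job.setMapperClass(FreqDistMapper.class);    // hypothetical class holding the map body
                job.setReducerClass(FreqDistReducer.class);  // hypothetical class holding the reduce body
                job.setMapOutputKeyClass(Text.class);
                job.setMapOutputValueClass(IntWritable.class);
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(LongWritable.class);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                System.exit(job.waitForCompletion(true) ? 0 : 1);
            }
        }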
  • 13. Hadoop Ecosystem | How/Where These Fit. (Diagram: layers for LOAD, STORAGE, PROCESSING, SUPPORT, DATA WAREHOUSE, and VISUALIZATION TOOLS serving USERS, with tools such as Sqoop, hiho, Scribe, Flume, ZooKeeper, and HUE.)
  • 14. Monte Carlo VaR Implementation
  • 15. Monte Carlo VaR: 2 Steps
    • Simulation: for each underlyer (e.g. IBM, MSFT, IBM.CO, …), generate 10,000 simulated values V1, V2, V3, …, V10,000
    • Aggregation: for each simulation k, compute the simulated hierarchy-level value HLVk = (∑AiVi)k, i.e. HLV1 = (∑AiVi)1, HLV2 = (∑AiVi)2, …, HLV10k = (∑AiVi)10k, where Ai is the position in instrument i and Vi its simulated value
    Challenges:
      o Daily trade data could be massive
      o Valuations are compute intensive
      o VaR is not a simple arithmetic sum across hierarchies
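    The deck does not show the scenario generator itself; below is a minimal geometric Brownian motion sketch (the drift, volatility, and horizon parameters are assumptions) of the kind of random walk the simulation step performs 10,000 times per underlyer:

        import java.util.Random;

        public class GbmSimulator {
            // One simulated terminal price per path via geometric Brownian motion:
            // S_T = S_0 * exp((mu - sigma^2/2) * T + sigma * sqrt(T) * Z), Z ~ N(0,1)
            public static double[] simulate(double s0, double mu, double sigma,
                                            double horizonYears, int paths, long seed) {
                Random rng = new Random(seed);
                double[] prices = new double[paths];
                double drift = (mu - 0.5 * sigma * sigma) * horizonYears;
                double diffusion = sigma * Math.sqrt(horizonYears);
                for (int i = 0; i < paths; i++) {
                    prices[i] = s0 * Math.exp(drift + diffusion * rng.nextGaussian());
                }
                return prices;
            }

            public static void main(String[] args) {
                // e.g. 10,000 simulated 5-day prices for a stock trading at 191.23
                double[] sims = simulate(191.23, 0.05, 0.25, 5.0 / 252.0, 10_000, 42L);
                System.out.println("first simulated price: " + sims[0]);
            }
        }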
  • 16. Simulation Step – MapReduce
    MAP:
      o Read through the portfolio data
      o Emit (K, V) as (Underlyer, InstrumentDetails), e.g. (IBM, IBM.CO.DEC14.225)
    REDUCE:
      o For the underlyer, perform 10k random walks in parallel
      o For each random walk output, simulate the derivative prices
      o Emit 10k sets of simulated prices of the stock and its associated derivatives, i.e.
        IBM, [V1, V2, …, V10000]
        IBM.CO.DEC14.225, [V1, V2, …, V10000]

    Driver:
        Job job = new Job(getConf());
        job.setJobName("RandomValuationGenerator");
        job.setMapperClass(SecurityAttributeMapper.class);
        job.setReducerClass(PriceSimulationsReducer.class);
        …

    Map:
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
            context.write(new Text(sa.getUnderlyer()), sa);
        }

    Reduce (body):
        SecurityAttributes stockAttrib = (SecurityAttributes) iter.next();
        simPricesStock = getSimPricesForStock(stockAttrib);
        writeReducerOutput(stockAttrib, simPricesStock, context);
        bsmp = new BlackScholesMertonPricingOption();
        while (iter.hasNext()) {
            SecurityAttributes secAttribs = iter.next();
            writeReducerOutput(secAttribs,
                getSimPricesForOptions(simPricesStock, bsmp, secAttribs), context);
        }
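    The reducer references a Black-Scholes-Merton pricer without showing it; below is the textbook Black-Scholes call price as a sketch (a standard formula, not the deck's BlackScholesMertonPricingOption implementation) for pricing a derivative off each simulated stock price:

        public class BlackScholesSketch {
            // Black-Scholes price of a European call: S spot, K strike, r rate,
            // sigma volatility, t time to expiry in years.
            public static double callPrice(double s, double k, double r, double sigma, double t) {
                double d1 = (Math.log(s / k) + (r + 0.5 * sigma * sigma) * t)
                          / (sigma * Math.sqrt(t));
                double d2 = d1 - sigma * Math.sqrt(t);
                return s * cnd(d1) - k * Math.exp(-r * t) * cnd(d2);
            }

            // Cumulative standard normal via the Abramowitz-Stegun polynomial approximation.
            static double cnd(double x) {
                double k = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
                double poly = k * (0.319381530 + k * (-0.356563782
                            + k * (1.781477937 + k * (-1.821255978 + k * 1.330274429))));
                double approx = 1.0 - Math.exp(-0.5 * x * x) / Math.sqrt(2 * Math.PI) * poly;
                return x >= 0 ? approx : 1.0 - approx;
            }
        }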
  • 17. Aggregation Step – MapReduce
    MAP:
      o Read through the de-normalized portfolio data
      o Emit (K, V) as (HierarchyLevel, PositionDetails), e.g.
        US, [IBM, 225, 191.23]
        US|Tech, [IBM, 400, 191.23]
        US|Tech|Eric, [IBM, 400, 191.23]
    REDUCE:
      o For the hierarchy level (e.g. US|ERIC), perform ∑AiVi for each simulation to get the simulated portfolio values HLVi
      o Sort the HLVi, find the 1%, 5% and 10% values, and emit the position and VaR data

    Map:
        protected void map(LongWritable key, HoldingWritable value, Context context)
                throws java.io.IOException, InterruptedException {
            SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
            Set<String> hierarchyLevels = sa.getHierarchyLevels();
            for (String hierarchyLevel : hierarchyLevels) {
                context.write(new Text(hierarchyLevel), new Text(sa.getPositionDtls()));
            }
        }

    Reduce (body):
        Map<String, Double> portfolioPositionData = combineInputForPFPositionData(rows);
        Map<String, Double[]> simulatedPrices =
            loadSimulatedPrices(portfolioPositionData.keySet());
        for (long i = 0; i < NO_OF_SIMULATIONS; i++) {
            simulatedPFValues.add(getPFSimulatedValue(i, portfolioPositionData, simulatedPrices));
        }
        Collections.sort(simulatedPFValues);
        emitResults(portfolioPositionData, simulatedPFValues);
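    A hedged sketch of what getPFSimulatedValue could look like (the name comes from the slide; the body is an assumption): for simulation i, sum quantity times simulated price over the positions at the hierarchy level.

        import java.util.Map;

        public class PortfolioValuation {
            // Portfolio value for simulation i: sum over positions of quantity * simulated price.
            // positions maps ticker -> quantity; simulatedPrices maps ticker -> one price per simulation.
            static double getPFSimulatedValue(long i,
                                              Map<String, Double> positions,
                                              Map<String, Double[]> simulatedPrices) {
                double value = 0.0;
                for (Map.Entry<String, Double> position : positions.entrySet()) {
                    value += position.getValue() * simulatedPrices.get(position.getKey())[(int) i];
                }
                return value;
            }
        }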
  • 18. DEMO RUN
  • 19. Observations
    • As expected, the processing time of the Map jobs increased only marginally when the input data volume was increased
    • The process was IO-bound on the simulation’s Reduce job, as the intermediate data emitted was huge
    • The data replication factor needs to be chosen carefully
    • MapReduce jobs should be designed so that Map/Reduce output is not huge (e.g. by compressing intermediate output, as sketched below)
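    One common mitigation for huge intermediate output is compressing the map output before the shuffle; a sketch assuming Hadoop 1.x property names (mapreduce.map.output.compress is the 2.x equivalent) and the Snappy codec:

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.io.compress.CompressionCodec;
        import org.apache.hadoop.io.compress.SnappyCodec;
        import org.apache.hadoop.mapreduce.Job;

        public class CompressedJobSetup {
            public static Job configure() throws Exception {
                Configuration conf = new Configuration();
                // Compress the map output that is shuffled to the reducers (Hadoop 1.x names).
                conf.setBoolean("mapred.compress.map.output", true);
                conf.setClass("mapred.map.output.compression.codec",
                              SnappyCodec.class, CompressionCodec.class);
                return new Job(conf, "RandomValuationGenerator");
            }
        }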
  • 20. Questions?
  • 21. Thank You!
  • 22. Appendix
  • 25. Let’s Build a Simple MapReduce Job
    Problem statement: across a huge set of documents, find all locations (i.e. document, page, line) of all words having more than 10 characters.
    (Diagram: the documents are stored in blocks across DATANODE1 and DATANODE2; a Map task runs against each stored block.)
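    A minimal mapper sketch for this exercise (the slide shows only the diagram; the document|page|line key layout of the input records is an assumption):

        import java.io.IOException;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Mapper;

        // Emits (word, "document|page|line") for every word longer than 10 characters.
        // Assumes each input line is prefixed "document|page|line<TAB>text of the line".
        public class LongWordMapper extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                String[] parts = value.toString().split("\t", 2);
                if (parts.length < 2) return;                 // skip malformed records
                Text location = new Text(parts[0]);           // "document|page|line"
                for (String word : parts[1].split("\\W+")) {
                    if (word.length() > 10) {
                        context.write(new Text(word), location);
                    }
                }
            }
        }

    A reducer would then simply concatenate the location strings seen for each long word.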