Risk Management Using Hadoop
Published in Business, Economy & Finance
Transcript

  • 1. Welcome to Redefining Perspectives, November 2012
  • 2. Capital Markets Risk Management and Hadoop. Kevin Samborn and Nitin Agrawal. © COPYRIGHT 2012 SAPIENT CORPORATION | CONFIDENTIAL
  • 3. Agenda
    o Risk Management
    o Hadoop
    o Monte Carlo VaR Implementation
    o Q&A
  • 4. Risk Management
  • 5. What Is Risk Management?
    o Risk is a tool; the goal is to understand and optimize risk
      • Too much risk is locally and systemically dangerous
      • Too little risk means the firm may be “leaving profit on the table”
    o Portfolio exposure
      • Modern portfolios contain many different types of assets
      • Simple instruments, complex instruments, and derivatives
    o Many types of risk measures
      • Defined scenario-based stress testing
      • Value at Risk (VaR)
      • “Sensitivities”
    o The key is valuation under different scenarios
    o VaR is used in banking regulation, margin calculation, and risk management
  • 6. Value at Risk (VaR)
    o VaR is a statistical measure of risk, expressed as an amount of loss at a given probability, e.g. a 97.5% chance that the firm will not lose more than 1 million USD over the next 5 days
    o Computing VaR is a challenging data-sourcing and compute-intensive process
    o VaR calculation:
      • Generate statistical scenarios of market behavior
      • Revalue the portfolio for each scenario and compare the returns to today’s value
      • Sort the results and select the desired percentile return: VALUE AT RISK
    o Different VaR techniques:
      • Parametric: analytic approximation
      • Historical: captures real (historical) market dynamics
      • Monte Carlo: many scenarios, depends on statistical distributions
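The three-step calculation on this slide (generate scenarios, revalue, sort and pick the percentile) can be sketched in a few lines of Java. This is a minimal single-asset illustration, not the presenters' code; the class name, the 2% daily volatility, and the normal-return model are all assumptions made for the example:

```java
import java.util.Arrays;
import java.util.Random;

/** Minimal Monte Carlo VaR sketch: simulate P&L, sort, take a percentile. */
public class SimpleVaR {

    /**
     * VaR at the given confidence level from simulated P&L values:
     * the loss exceeded with probability (1 - confidence),
     * reported as a positive amount.
     */
    public static double valueAtRisk(double[] simulatedPnl, double confidence) {
        double[] sorted = simulatedPnl.clone();
        Arrays.sort(sorted);                                // worst losses first
        int index = (int) Math.floor((1.0 - confidence) * sorted.length);
        return -sorted[index];
    }

    public static void main(String[] args) {
        // Step 1: generate statistical scenarios of market behavior
        Random rng = new Random(42);
        double portfolioValue = 1_000_000.0;
        double[] pnl = new double[10_000];
        for (int i = 0; i < pnl.length; i++) {
            double dailyReturn = 0.02 * rng.nextGaussian(); // assumed 2% daily vol
            // Step 2: revalue the portfolio and compare to today's value
            pnl[i] = portfolioValue * dailyReturn;
        }
        // Step 3: sort the results and select the desired percentile
        System.out.printf("97.5%% 1-day VaR: %.0f USD%n", valueAtRisk(pnl, 0.975));
    }
}
```

The rest of the deck is essentially about running steps 1 and 2 at scale on Hadoop, because a real portfolio replaces the single normal return here with thousands of instrument-level revaluations per scenario.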
  • 7. VaR Graphically. Source: An Introduction to Value at Risk (VaR), Investopedia, May 2010
  • 8. Complexities
    o For modern financial firms, VaR is complex. Calculation requirements:
      • Different types of assets require different valuation models (risk-based approach vs. full revaluation)
      • With large numbers of scenarios, many thousands of calculations are required
      • Monte Carlo simulations require significant calibration, depending on large amounts of historical data
    o Many different reporting dimensions
      • VaR is not additive across dimensions (product/asset class, currency)
      • Portfolio, including “what-if” and intraday activity
    o Intraday market changes require new simulations
    o Incremental VaR: how does a single (new) trade contribute to the total?
  • 9. Backtesting VaR
  • 11. Hadoop Core
    Storage:
    o Data stored with REDUNDANCY on a Distributed File System
    o Abstracts H/W FAILURES, delivering a highly available service on COMMODITY H/W
    o SCALES UP from a single node to thousands of nodes
    o Data stored WITHOUT A SCHEMA
    o Tuned for SEQUENTIAL DATA ACCESS
    Processing:
    o Provides an EASY ABSTRACTION for processing large data sets
    o Infrastructure for PARALLEL DATA PROCESSING across a huge commodity cluster
    o Infrastructure for TASK and LOAD MANAGEMENT
    o The framework achieves DATA-PROCESS LOCALITY
    Makes two critical assumptions, though:
    o Data doesn’t need to be updated
    o Data doesn’t need to be accessed randomly
  • 12. A Simple MapReduce Job
    Problem statement: from historical price data, create a frequency distribution of the 1-day percentage change for various stocks.
    Sample input (Stock | Date | Open | Close):
      BP  | 23-Nov | 435.25 | 435.5
      NXT | 23-Nov | 3598   | 3620
      MKS | 23-Nov | 378.5  | 380.7
      ... (further rows for the same three tickers, 15-Nov through 22-Nov)
    Sample output (ticker|%change, count): BP|1, 33; BP|2, 64; NXT|81, 2; NXT|-20, 5; ...
    The records flow through parallel Map tasks, are sorted and shuffled, then grouped in Reduce tasks that write the final outputs.

      public void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
        context.write(new Text(sa.getTicker()),
                      new IntWritable(sa.getPercentChange()));
      }

      public void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        Map<Integer, Long> freqDist = buildFreqDistribution(values);
        for (Integer percentChange : freqDist.keySet()) {
          context.write(new Text(key.toString() + "|" + percentChange.toString()),
                        new LongWritable(freqDist.get(percentChange)));
        }
      }
  • 13. Hadoop Ecosystem | How/Where These Fit (diagram: layers for visualization tools, users, data warehouse, processing, load, storage, and support; tools shown include Sqoop, ZooKeeper, hiho, Scribe, HUE, and Flume)
  • 14. Monte Carlo VaR Implementation
  • 15. Monte Carlo VaR: Two Steps
    o SIMULATION: for each instrument (e.g. IBM, MSFT, IBM.CO), generate simulated values V1, V2, V3, ..., V10,000
    o AGGREGATION: for each simulation i, compute the hierarchy-level value HLVi = (∑AiVi) across the portfolio, giving HLV1, HLV2, ..., HLV10k
    Challenges:
    o Daily trade data can be massive
    o Valuations are compute-intensive
    o VaR is not a simple arithmetic sum across hierarchies
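The aggregation arithmetic on this slide can be sketched directly. This is an illustrative sketch only; the method names and data shapes are assumptions, not the presenters' code:

```java
import java.util.Arrays;
import java.util.Map;

/**
 * Sketch of the aggregation step: HLV_i = sum over positions j of A_j * V_ij,
 * i.e. position quantity times simulated value, summed per simulation.
 */
public class VarAggregation {

    /** positions: instrument -> quantity A; simulatedPrices: instrument -> [V_1..V_n]. */
    public static double[] hierarchyLevelValues(Map<String, Double> positions,
                                                Map<String, double[]> simulatedPrices,
                                                int numSimulations) {
        double[] hlv = new double[numSimulations];
        for (Map.Entry<String, Double> position : positions.entrySet()) {
            double quantity = position.getValue();
            double[] values = simulatedPrices.get(position.getKey());
            for (int i = 0; i < numSimulations; i++) {
                hlv[i] += quantity * values[i];   // accumulate A_j * V_ij
            }
        }
        return hlv;
    }

    /** Sort the simulated portfolio values and read off a percentile (e.g. 1%). */
    public static double percentile(double[] hlv, double p) {
        double[] sorted = hlv.clone();
        Arrays.sort(sorted);
        int index = Math.min((int) Math.floor(p * sorted.length), sorted.length - 1);
        return sorted[index];
    }
}
```

Because the percentile is taken only after summing across positions, each hierarchy level needs its own pass over all simulations; this is the concrete reason VaR is not a simple arithmetic sum across hierarchies.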
  • 16. Simulation Step: MapReduce
    MAP:
    o Read through portfolio data
    o Emit (K, V) as (Underlyer, InstrumentDetails), e.g. (IBM, IBM.CO.DEC14.225)
    REDUCE:
    o For the underlyer, perform 10k random walks in parallel
    o For each random walk output, simulate derivative prices
    o Emit 10k sets of simulated prices of the stock and associated derivatives, i.e.
      IBM, [V1, V2, ..., V10000]
      IBM.CO.DEC14.225, [V1, V2, ..., V10000]

      // Driver
      Job job = new Job(getConf());
      job.setJobName("RandomValuationGenerator");
      job.setMapperClass(SecurityAttributeMapper.class);
      job.setReducerClass(PriceSimulationsReducer.class);

      // Mapper
      public void map(LongWritable key, Text value, Context context)
          throws IOException, InterruptedException {
        SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
        context.write(new Text(sa.getUnderlyer()), sa);
      }

      // Reducer body (fragment)
      BlackScholesMertonPricingOption bsmp = new BlackScholesMertonPricingOption();
      SecurityAttributes stockAttrib = (SecurityAttributes) iter.next();
      simPricesStock = getSimPricesForStock(stockAttrib);
      writeReducerOutput(stockAttrib, simPricesStock, context);
      while (iter.hasNext()) {
        SecurityAttributes secAttribs = iter.next();
        writeReducerOutput(secAttribs,
            getSimPricesForOptions(simPricesStock, bsmp, secAttribs), context);
      }
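The "10k random walks per underlyer" performed in the reducer are commonly geometric Brownian motion draws; the following is a minimal sketch under that assumption (the class name and parameters are invented for the example, not the presenters' code):

```java
import java.util.Random;

/**
 * One-step geometric Brownian motion: simulates terminal stock prices,
 * S_T = S_0 * exp((mu - sigma^2/2) * t + sigma * sqrt(t) * Z), Z ~ N(0,1).
 */
public class RandomWalkSimulator {

    public static double[] simulateTerminalPrices(double spot, double mu, double sigma,
                                                  double horizonYears, int paths, long seed) {
        Random rng = new Random(seed);
        double drift = (mu - 0.5 * sigma * sigma) * horizonYears;
        double vol = sigma * Math.sqrt(horizonYears);
        double[] prices = new double[paths];
        for (int i = 0; i < paths; i++) {
            prices[i] = spot * Math.exp(drift + vol * rng.nextGaussian());
        }
        return prices;
    }

    public static void main(String[] args) {
        // e.g. 10,000 one-year paths for a 100 USD stock: 5% drift, 20% vol (assumed)
        double[] simulated = simulateTerminalPrices(100.0, 0.05, 0.20, 1.0, 10_000, 42L);
        System.out.println("First simulated price: " + simulated[0]);
    }
}
```

Each simulated stock price would then be fed into an option pricing model (Black-Scholes-Merton, as in the slide's reducer) to produce the corresponding derivative values.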
  • 17. Aggregation Step: MapReduce
    MAP:
    o Read through de-normalized portfolio data
    o Emit (K, V) as (Hierarchy level, Position details), e.g.
      US, [IBM, 225, 191.23]
      US|Tech, [IBM, 400, 191.23]
      US|Tech|Eric, [IBM, 400, 191.23]
    REDUCE:
    o For the hierarchy level (e.g. US|ERIC), compute ∑AiVi for each simulation to get the simulated portfolio values HLVi
    o Sort HLVi, find the 1%, 5%, and 10% values, and emit the position and VaR data

      // Mapper
      protected void map(LongWritable key, HoldingWritable value, Context context)
          throws java.io.IOException, InterruptedException {
        SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
        for (String hierarchyLevel : sa.getHierarchyLevels()) {
          context.write(new Text(hierarchyLevel), new Text(sa.getPositionDtls()));
        }
      }

      // Reducer body (fragment)
      Map<String, Double> portfolioPositionData = combineInputForPFPositionData(rows);
      Map<String, Double[]> simulatedPrices =
          loadSimulatedPrices(portfolioPositionData.keySet());
      for (long i = 0; i < NO_OF_SIMULATIONS; i++) {
        simulatedPFValues.add(getPFSimulatedValue(i, portfolioPositionData, simulatedPrices));
      }
      Collections.sort(simulatedPFValues);
      emitResults(portfolioPositionData, simulatedPFValues);
  • 18. DEMO RUN
  • 19. Observations
    o As expected, the processing time of the Map jobs increased only marginally when the input data volume was increased
    o The process was IO-bound on the Simulation step’s Reduce job, as the intermediate data emitted was huge
    o The data replication factor needs to be chosen carefully
    o MapReduce jobs should be designed so that the Map/Reduce output is not huge
  • 20. Questions?
  • 21. Thank You!
  • 22. Appendix
  • 25. Let’s Build a Simple MapReduce Job
    Problem statement: across a huge set of documents, find all locations (i.e. document, page, line) of all words having more than 10 characters.
    (Diagram: documents stored across DATANODE1 and DATANODE2; each node runs a Map over its locally stored blocks.)
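The job on this slide can be sketched as a pair of map and reduce functions. This is an in-memory illustration of the idea, not Hadoop API code; the class, types, and the "doc:line" location format are invented for the example:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * In-memory sketch of the long-word job: map emits (word -> "doc:line")
 * for every word longer than 10 characters; reduce groups locations per word.
 */
public class LongWordLocations {

    /** Map phase: called once per input line, emits (word, location) pairs. */
    public static List<Map.Entry<String, String>> map(String doc, int lineNo, String line) {
        List<Map.Entry<String, String>> emitted = new ArrayList<>();
        for (String word : line.split("\\W+")) {
            if (word.length() > 10) {
                emitted.add(Map.entry(word, doc + ":" + lineNo));
            }
        }
        return emitted;
    }

    /** Reduce phase: groups all emitted locations by word. */
    public static Map<String, List<String>> reduce(List<Map.Entry<String, String>> pairs) {
        Map<String, List<String>> locations = new HashMap<>();
        for (Map.Entry<String, String> pair : pairs) {
            locations.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                     .add(pair.getValue());
        }
        return locations;
    }
}
```

On a cluster, the map calls run in parallel on the datanodes holding each document block, and the shuffle delivers all pairs for a given word to one reducer.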