Risk Management Using Hadoop
    Presentation Transcript

    • Welcome to Redefining Perspectives, November 2012
    • Capital Markets Risk Management and Hadoop
      Kevin Samborn and Nitin Agrawal
      © COPYRIGHT 2012 SAPIENT CORPORATION | CONFIDENTIAL
    • Agenda
      • Risk Management
      • Hadoop
      • Monte Carlo VaR Implementation
      • Q&A
    • Risk Management
    • What is Risk Management?
      • Risk is a tool; the goal is to understand and optimize risk
        o Too much risk is locally and systemically dangerous
        o Too little risk means the firm may be "leaving profit on the table"
      • Portfolio exposure
        o Modern portfolios contain many different types of assets
        o Simple instruments, complex instruments, and derivatives
      • Many types of risk measures
        o Defined scenario-based stress testing
        o Value at Risk (VaR)
        o "Sensitivities"
      • The key is valuation under different scenarios
      • VaR is used in banking regulation, margin calculations, and risk management
    • Value at Risk (VaR)
      • VaR is a statistical measure of risk, expressed as an amount of loss at a given probability:
        e.g. a 97.5% chance that the firm will not lose more than 1 million USD over the next 5 days
      • Computing VaR is a challenging data-sourcing and compute-intensive process
      • VaR calculation:
        o Generate statistical scenarios of market behavior
        o Revalue the portfolio for each scenario and compare returns to today's value
        o Sort the results and select the desired percentile return: the VALUE AT RISK
      • Different VaR techniques:
        o Parametric: analytic approximation
        o Historical: captures real (historical) market dynamics
        o Monte Carlo: many scenarios, depends on statistical distributions
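The sort-and-select step described on this slide can be sketched in a few lines of plain Java. This is a minimal illustration, not code from the deck; the class, method names, and the floor-based index convention for picking the tail scenario are assumptions.

```java
import java.util.Arrays;

public class SimpleVar {
    // Given simulated one-period P&L values, VaR at confidence c is the
    // loss at the (1 - c) quantile of the sorted P&L distribution.
    static double valueAtRisk(double[] pnl, double confidence) {
        double[] sorted = pnl.clone();
        Arrays.sort(sorted);                          // ascending: worst losses first
        int index = (int) Math.floor((1.0 - confidence) * sorted.length);
        return -sorted[index];                        // report VaR as a positive loss
    }

    public static void main(String[] args) {
        // Hypothetical P&L for 10 scenarios (a real run would use thousands)
        double[] pnl = {-5.0, -3.0, -1.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0};
        System.out.println(valueAtRisk(pnl, 0.90));   // prints 3.0
    }
}
```

With only 10 scenarios the quantile estimate is crude; the point is that whichever technique generates the scenarios (parametric, historical, or Monte Carlo), the final VaR read-off is just a sort and an index lookup.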
    • VaR Graphically (chart; source: An Introduction to Value at Risk (VAR), Investopedia, May 2010)
    • Complexities
      • For modern financial firms, VaR is complex. Calculation requirements:
        o Different types of assets require different valuation models
          • Risk-based approach
          • Full revaluation
        o With large numbers of scenarios, many thousands of calculations are required
        o Monte Carlo simulations require significant calibration, depending on large amounts of historical data
      • Many different reporting dimensions
        o VaR is not additive across dimensions (product/asset class, currency)
        o Portfolio, including "what-if" and intraday activity
      • Intraday market changes require new simulations
      • Incremental VaR: how does a single (new) trade contribute to the total?
    • Backtesting VaR
    • Hadoop Core
      Storage (HDFS):
      • Data stored with REDUNDANCY on a distributed file system
      • Abstracts H/W FAILURES, delivering a highly available service on COMMODITY H/W
      • SCALES UP from a single node to thousands of nodes
      • Data stored WITHOUT A SCHEMA
      • Tuned for SEQUENTIAL DATA ACCESS
      Processing (MapReduce):
      • Provides an EASY ABSTRACTION for processing large data sets
      • Infrastructure for PARALLEL DATA PROCESSING across a huge commodity cluster
      • Infrastructure for TASK and LOAD MANAGEMENT
      • Framework achieves DATA-PROCESS LOCALITY
      Makes two critical assumptions, though:
      • Data doesn't need to be updated
      • Data doesn't need to be accessed randomly
    • A Simple MapReduce Job
      Problem statement: from historical price data, create a frequency distribution of the 1-day percentage change for various stocks.
      Sample input (Stock, Date, Open, Close):
        BP   23-Nov  435.25  435.5
        NXT  23-Nov  3598    3620
        MKS  23-Nov  378.5   380.7
        BP   22-Nov  434.8   433.6
        NXT  22-Nov  3579    3603
        MKS  22-Nov  377.8   378
        BP   21-Nov  430.75  433
        NXT  21-Nov  3574    3582
        MKS  21-Nov  375     376
        BP   20-Nov  430.9   432.25
        NXT  20-Nov  3592    3600
        MKS  20-Nov  373.7   375.3
        BP   19-Nov  422.5   431.6
        NXT  19-Nov  3560    3600
        MKS  19-Nov  368.5   372.6
        BP   16-Nov  423.9   416.6
        NXT  16-Nov  3575    3542
        MKS  16-Nov  370.3   366.4
        BP   15-Nov  422     425.4
        NXT  15-Nov  3596    3550
        MKS  15-Nov  376.5   370.6
      The mappers emit (ticker, percent change); after the shuffle, the reducers emit (ticker|percent change, frequency), e.g. BP|1, 33; BP|2, 64; NXT|81, 2; NXT|-20, 5.

        public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
          context.write(new Text(sa.getTicker()),
              new IntWritable(sa.getPercentChange()));
        }

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
          Map<Integer, Long> freqDist = buildFreqDistribution(values);
          for (Integer percentChange : freqDist.keySet()) {
            context.write(new Text(key.toString() + "|" + percentChange.toString()),
                new LongWritable(freqDist.get(percentChange)));
          }
        }
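The map/shuffle/reduce flow on this slide can be emulated in plain Java without a cluster, which makes the data movement easy to see. This is a standalone sketch, not the deck's code: the record format ("ticker,date,open,close"), the integer rounding of percent changes, and all class names here are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FreqDistDemo {
    // Map phase: one (ticker, percentChange) pair per daily record.
    // A record here is "ticker,date,open,close"; the 1-day percent change
    // is rounded to the nearest integer, matching keys like BP|1 on the slide.
    static Map.Entry<String, Integer> map(String record) {
        String[] f = record.split(",");
        double open = Double.parseDouble(f[2]);
        double close = Double.parseDouble(f[3]);
        int pct = (int) Math.round((close - open) / open * 100.0);
        return Map.entry(f[0], pct);
    }

    // Reduce phase: for each ticker, count how often each percent change occurs.
    static Map<Integer, Long> reduce(List<Integer> changes) {
        Map<Integer, Long> freq = new TreeMap<>();
        for (int c : changes) freq.merge(c, 1L, Long::sum);
        return freq;
    }

    public static void main(String[] args) {
        List<String> records = List.of(
            "BP,23-Nov,435.25,435.5", "BP,22-Nov,434.8,433.6",
            "NXT,23-Nov,3598,3620", "NXT,22-Nov,3579,3603");
        // Shuffle: group mapped values by ticker, then reduce each group.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String r : records) {
            Map.Entry<String, Integer> kv = map(r);
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
        }
        grouped.forEach((ticker, changes) ->
            System.out.println(ticker + " -> " + reduce(changes)));
    }
}
```

Hadoop performs the same grouping automatically between the map and reduce phases; the framework's shuffle is what the `grouped` map stands in for here.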
    • Hadoop Ecosystem | How/Where These Fit
      (Diagram: a layered stack with LOAD tools such as Sqoop, hiho, Scribe, and Flume feeding STORAGE; PROCESSING above storage; a DATA WAREHOUSE layer and VISUALIZATION TOOLS facing USERS; and SUPPORT components such as ZooKeeper and HUE alongside)
    • Monte Carlo VaR Implementation
    • Monte Carlo VaR: 2 Steps
      1. SIMULATION: for each underlyer (e.g. IBM, MSFT) and its derivatives (e.g. IBM.CO), generate 10,000 simulated values V1, V2, V3, ..., V10,000
      2. AGGREGATION: for each hierarchy level, compute HLVi = (∑AiVi) for each simulation, i.e. HLV1 = (∑AiVi)1, HLV2 = (∑AiVi)2, ..., HLV10k = (∑AiVi)10k
      Challenges:
      • Daily trade data could be massive
      • Valuations are compute intensive
      • VaR is not a simple arithmetic sum across hierarchies
    • Simulation Step: MapReduce
      MAP:
      • Read through portfolio data
      • Emit (K, V) as (Underlyer, InstrumentDetails), e.g. (IBM, IBM.CO.DEC14.225)
      REDUCE:
      • For the underlyer, perform 10k random walks in parallel
      • For each random walk output, simulate derivative prices
      • Emit 10k sets of simulated prices of the stock and its associated derivatives, i.e.
        IBM, [V1, V2, ..., V10000]
        IBM.CO.DEC14.225, [V1, V2, ..., V10000]

        Job job = new Job(getConf());
        job.setJobName("RandomValuationGenerator");
        job.setMapperClass(SecurityAttributeMapper.class);
        job.setReducerClass(PriceSimulationsReducer.class);
        ...

        public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
          SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
          context.write(new Text(sa.getUnderlyer()), sa);
        }

        // Reducer body, per underlyer: price the stock first, then its options
        bsmp = new BlackScholesMertonPricingOption();
        SecurityAttributes stockAttrib = (SecurityAttributes) iter.next();
        simPricesStock = getSimPricesForStock(stockAttrib);
        writeReducerOutput(stockAttrib, simPricesStock, context);
        while (iter.hasNext()) {
          SecurityAttributes secAttribs = iter.next();
          writeReducerOutput(secAttribs,
              getSimPricesForOptions(simPricesStock, bsmp, secAttribs), context);
        }
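The "10k random walks" the reducer performs can be sketched as a one-step geometric Brownian motion simulation. This is a generic sketch, not the deck's pricing code; the drift/volatility parameters, the GBM model choice, and the class name are assumptions.

```java
import java.util.Random;

public class PriceSimulator {
    // One-step geometric Brownian motion:
    //   S' = S * exp((mu - sigma^2/2) * dt + sigma * sqrt(dt) * Z),  Z ~ N(0, 1)
    static double[] simulate(double spot, double mu, double sigma,
                             double dt, int paths, long seed) {
        Random rng = new Random(seed);
        double drift = (mu - 0.5 * sigma * sigma) * dt;
        double vol = sigma * Math.sqrt(dt);
        double[] simulated = new double[paths];
        for (int i = 0; i < paths; i++) {
            simulated[i] = spot * Math.exp(drift + vol * rng.nextGaussian());
        }
        return simulated;
    }

    public static void main(String[] args) {
        // Hypothetical parameters: 5% annual drift, 20% vol, 1-day horizon, 10,000 paths
        double[] prices = simulate(190.0, 0.05, 0.20, 1.0 / 252, 10_000, 42L);
        System.out.println("first simulated price: " + prices[0]);
    }
}
```

In the slide's design, each of these simulated underlyer prices would then be fed through an option-pricing model (the deck references Black-Scholes-Merton) to produce the derivative's 10k simulated values.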
    • Aggregation Step: MapReduce
      MAP:
      • Read through de-normalized portfolio data
      • Emit (K, V) as (Hierarchy level, Position details), e.g.
        US, [IBM, 225, 191.23]
        US|Tech, [IBM, 400, 191.23]
        US|Tech|Eric, [IBM, 400, 191.23]
      REDUCE:
      • For the hierarchy level (e.g. US|ERIC), perform ∑AiVi for each simulation to get the simulated portfolio values HLVi
      • Sort HLVi, find the 1%, 5%, and 10% values, and emit position and VaR data

        protected void map(LongWritable key, HoldingWritable value, Context context)
            throws java.io.IOException, InterruptedException {
          SecurityAttributes sa = RecordsReadHelper.readAttribs(value.toString());
          Set<String> hierarchyLevels = sa.getHierarchyLevels();
          for (String hierarchyLevel : hierarchyLevels) {
            context.write(new Text(hierarchyLevel), new Text(sa.getPositionDtls()));
          }
        }

        // Reducer body, per hierarchy level
        Map<String, Double> portfolioPositionData = combineInputForPFPositionData(rows);
        Map<String, Double[]> simulatedPrices =
            loadSimulatedPrices(portfolioPositionData.keySet());
        for (long i = 0; i < NO_OF_SIMULATIONS; i++) {
          simulatedPFValues.add(getPFSimulatedValue(i, portfolioPositionData, simulatedPrices));
        }
        Collections.sort(simulatedPFValues);
        emitResults(portfolioPositionData, simulatedPFValues);
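The reducer's ∑AiVi-then-sort logic can be shown end to end on a tiny example. This is a self-contained sketch under stated assumptions: the position quantities, the 4-scenario price matrix, and the floor-based percentile index are all hypothetical, and a real run would use the 10,000 simulations described above.

```java
import java.util.Arrays;

public class PortfolioVar {
    // HLV_i = sum_j A_j * V_{j,i}: portfolio value for scenario i, given
    // per-position quantities A and a matrix of simulated prices V
    // (one row per position, one column per scenario).
    static double[] portfolioValues(double[] quantities, double[][] simPrices) {
        int scenarios = simPrices[0].length;
        double[] hlv = new double[scenarios];
        for (int i = 0; i < scenarios; i++)
            for (int j = 0; j < quantities.length; j++)
                hlv[i] += quantities[j] * simPrices[j][i];
        return hlv;
    }

    // VaR at percentile pct: today's value minus the sorted pct% scenario value.
    static double var(double[] hlv, double todayValue, double pct) {
        double[] sorted = hlv.clone();
        Arrays.sort(sorted);
        int idx = (int) Math.floor(pct / 100.0 * sorted.length);
        return todayValue - sorted[idx];
    }

    public static void main(String[] args) {
        double[] qty = {225, 400};               // hypothetical position quantities
        double[][] sim = {
            {180, 190, 200, 210},                // 4 scenarios for position 1
            {90, 95, 100, 105}};                 // 4 scenarios for position 2
        double[] hlv = portfolioValues(qty, sim);
        double today = 225 * 191.0 + 400 * 96.0; // today's portfolio value
        System.out.println(var(hlv, today, 5.0)); // prints 4875.0
    }
}
```

Because this sum-then-sort must run per hierarchy level, VaR cannot be rolled up by adding child-level VaRs, which is exactly why the slide emits one key per hierarchy level in the map phase.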
    • DEMO RUN
    • Observations
      • As expected, the processing time of Map jobs increased only marginally when the input data volume was increased
      • The process was I/O-bound on the Simulation step's Reduce job, as the intermediate data emitted was huge
      • The data replication factor needs to be chosen carefully
      • MapReduce jobs should be designed so that Map/Reduce output is not huge
    • Questions?
    • Thank You!
    • Appendix
    • Let's Build a Simple MapReduce Job
      Problem statement: across a huge set of documents, find all locations (i.e. document, page, line) of all words having more than 10 characters.
      (Diagram: documents stored across DATANODE1 and DATANODE2, with Store and Map steps)