
How Apache Spark and Apache Hadoop are being used to keep banking regulators happy

The global financial crisis showed that traditional IT systems at banks were ill-equipped to monitor and manage a risk landscape that changed daily. The sheer amount of data that needed to be crunched meant that many banks were a day or more behind in calculating, understanding and reporting their risk positions. After the crisis, a review by banking regulators led to the introduction of BCBS 239: Principles for effective risk data aggregation and risk reporting. It requires banks to meet more stringent requirements, timeliness in particular, when aggregating and reporting on their quickly changing risk positions, or risk fines running into millions of dollars. To meet these requirements, banks have been forced to rethink their traditional IT architectures, which cannot cope with the sheer volume of risk data, and are turning to Apache Hadoop and Apache Spark to build the next generation of risk systems. In this talk you will discover how some of the leading banks in the world are using Apache Hadoop and Apache Spark to meet the BCBS 239 requirements.

Speaker
Kunal Taneja

Published in: Technology

  1. How Apache Spark and Apache Hadoop are helping to keep the banking regulators happy (© Cloudera, Inc. All rights reserved.)
  2. Agenda
     • Existing Architecture for Analytics & Risk
     • Ever-changing Regulatory Landscape
     • Challenges with existing architectures
     • Modern architecture for Financial Risk
     • Demo of key capabilities
  3. Typical Existing Analytical Architecture
     Diagram labels: Data Sources; ETL/Staging; EDW; Archive; Data Marts; Canned Reports; Dashboards/Analytic Applications; Non-SQL Workloads; Self-Service BI/Ad Hoc
  4. Regulatory Landscape
     Timeline axis: 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019
     Regulations shown: ICB Ring-fencing; ICB Loss Absorbency; Leverage Ratio - Basel III; NSFR – Basel III; MiFID II; T2S; LCR - Basel III; ICB / Competition; Audit Policy; Cross Border Debt Recovery; Financial Transaction Tax; Market Abuse Directive (MAD II); PRIP; Accounting Directive Review; AIFM Directive; EU Transparency Directive; EU Reg on Credit Rating Agencies; CRDV; Internal Governance Guidelines; FATCA; PD; EMIR; SWAPS Push Out – Dodd Frank; Securities Law Directive (SLD); Volcker Rule – Dodd Frank; Short Selling; Close Out Netting; Crisis Management; Recovery & Resolution; BCBS 239; FRTB
     Legend: Effective dates yet to be confirmed
  5. Existing Architectures under pressure: Limited Data – Incorporating new risk factors
     Architecture diagram as on slide 3, with callouts:
     ! Limited Data & Insight
       • Adding new data source
       • Risk Factors
     ! Latent Value
       • How long to get new reports with new risk factors
  6. Existing Architectures under pressure: Missed SLAs for VaR, ES & Stress scenarios
     Architecture diagram as on slide 3, with callouts:
     ! Overloaded Bottlenecks
       • Ever-increasing ETL windows
     ! Overloaded Bottlenecks
       • Ever-increasing batch windows to extract data
  7. Existing Architectures under pressure: Frustrated Quants on the “edge” nodes (not-only-sql)
     Architecture diagram as on slide 3, with callout:
     ! Lack of Tooling
       • Ad-hoc, on-demand complex risk modeling requirements
  8. http://www.bis.org/publ/bcbs239.pdf
  9. BCBS-239: Principles for Risk Data Aggregation
     • III - Accuracy & Integrity: Strive for a single authoritative source for risk data. Aggregate on an automated basis.
     • IV - Completeness: Capture and aggregate all material risk data. Data available by business line, legal entity, asset type, industry, region.…
     • V - Timeliness: Generate aggregate and up-to-date risk data in a timely manner.
     • VI - Adaptability: Meet a broad range of on-demand, ad-hoc risk management reporting requests.
     Challenges listed on the slide:
     • Data, models and processes live in silos
     • Hard to get enterprise wide view of risk
     • Difficult to aggregate
     • Lack of enterprise data taxonomy
     • Failed audits
     • Aggregate / reported risk data is infrequent and stale
     • Unable to handle crisis situations
     • Complex risk modeling process
     • Unable to handle crisis situations
  10. A modern risk platform calls for…
     • Scalability: More risk measures, more scenarios. Fine-grained risk data results in an order of magnitude increase in volume.
     • Speed: More frequent stress testing and regulatory reporting. High velocity scenario development and deployment.
     • Agility: More frequent stress testing and support for a variety of languages. Pre-trade decisions. “What-if” scenarios.
     • Transparency: Verifiable data. Timely response to audits. Data quality and lineage. Data and model governance.
  11. Evolution towards a modern risk platform: Risk & Regulatory Compliance Use Cases on Hadoop
     • Storage: Archival; Traceability
     • Batch: ETL; Data Validation; Reg Reporting
     • Interactive: Risk Aggregation; Stress Testing
     • HPC: Risk Modeling; Backtesting; Simulation
     • Streaming & Real Time: Mkt Surveillance; Best Execution
     Highlighted components:
     • HDFS: High-throughput, scalable, fault-tolerant, distributed file system.
     • MapReduce: Distributed parallel processing frameworks.
  12. Evolution towards a modern risk platform: Risk & Regulatory Compliance Use Cases on Hadoop
     Use-case grid as on slide 11. Highlighted components:
     • Apache Impala: Massively Parallel Processing (MPP) SQL engine.
     • Apache Spark: In-memory distributed processing framework.
  13. Evolution towards a modern risk platform: Risk & Regulatory Compliance Use Cases on Hadoop
     Use-case grid as on slide 11. Highlighted components:
     • Apache Spark: Distributed compute framework. Can support Python / C++, as well as Java and Scala.
     • Cloudera Data Science Workbench: Fully integrated data science notebook application.
  14. Evolution towards a modern risk platform: Risk & Regulatory Compliance Use Cases on Hadoop
     Use-case grid as on slide 11. Highlighted components:
     • Cloudera Data Science Workbench
     • Apache Kudu: Real-time streaming architectures for true Aggregated Risk on Demand
  15. Modern Platform for Analytics and Machine Learning
     Diagram labels: Data Sources; EDW; Modern Data Platform (Analytic Database; Operational Database; Data Science & Engineering; Shared Data Layer); Fixed Reports; Dashboards/Analytic Applications; Non-SQL Workloads; Self-Service BI/Ad Hoc; Flexible Reporting
     Regulations listed: MiFID II, FRTB, IFRS-9, BCBS-239, MAD/MAR, GDPR, ….
  16. BCBS 239 / FRTB “Illustrative” Architecture
     • Stages: Market Data; Revaluation; Calculation & Aggregation; Reporting
     • Market data labels: Market Data Feeds; IPV (Independent Price Valuation Function); MRF / NMRF (Modelable & Non-Modelable Risk Factors); Calibration
     • Front Office Pricing Engines: Fixed Income; Equity Mkts; FX; … Other Mkts
     • Data labels: Enterprise Data Hub; Static Data; Market Data; Configuration; P&L Vectors; Sensitivities; Events; Positions & Transaction Data
     • Scenarios: Current; Historic; Stressed; Projected
     • Risk Metrics: SA-related Risk Components; Counter-Party Credit Risk; XVA; ES & Stressed ES; P&L Attribution; VaR
     • Regulatory Applications: MiFID 2; Stress Testing; GDPR; FRTB SA; FRTB IMA; EMIR
     • Reporting labels: Regulatory Reporting; Management Reporting
     • Other labels: Scenarios; Risk; Sensitivities
  17. BCBS 239 – Timeliness (Real-time risk): Simplifying Lambda architectures with Apache Kudu
     Diagram labels: Data Sources; Kafka; Spark Streaming; Kudu; Spark MLlib; Application; Individual Session; Full Model/Learning Genesis
     Real-time Risk with Greeks:
     1 Event Occurs
     2 Market Data
     3 Stream Processing
     4 Land in RDBMS
     5 Batch Valuation
  18. Modern Platform for Analytics and Machine Learning: Supporting complete development lifecycle for risk
     • Data Management (User: Data Engineer): Metadata Management; Ingest; Validation; Profiling
     • Exploration / Model Development (User: Quant Analyst): Feature Engineering; Model Training & Testing; Visualization
     • Production / Model Deployment (Users: Data / Dev / Ops Engineer): Production Feature Generation; Production Model Port; Production Testing; Result Validation; Serving
     • Developer Tools: IDEs, Notebooks, SCM
     • Operations Tools: Scheduling, Workflow, Publishing
  19. Risk Footprint with Apache Spark and Hadoop
     • 19 GSIB customers
     • 9 banks with risk use cases in production
     • 6000+ nodes deployed
     • >5 years in production
  20. Market Risk aggregation platform for a Global Systemically Important Bank
     • 55x faster processing, 8x more data capacity
     • 300+ daily interactive users analyzing current and historical data
  21. Global Systemically Important Bank
     • On-premise and cloud-based Hadoop clusters according to workload.
     • Tested on AWS to 40,000 cores.
     • Demonstrated linear scaling of simulation workloads.
  22. Demo
  23. Q&A
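
Slides 6, 9 and 16 refer to VaR, ES, P&L vectors and aggregation by business line or legal entity without showing a calculation. As a minimal sketch of what such an aggregation could look like, the Python below sums hypothetical per-scenario P&L vectors by a chosen dimension and derives historical-simulation VaR and ES from the result. The position data, the function names `aggregate_pnl` and `var_es`, the 60% confidence level and the tail-size convention are all assumptions made for illustration; none of them comes from the deck or from any Cloudera product.

```python
# Hypothetical positions: each carries one P&L value per scenario (a "P&L vector").
positions = [
    {"business_line": "Rates", "legal_entity": "UK", "pnl": [10, -40, 5, -20, 15]},
    {"business_line": "Rates", "legal_entity": "US", "pnl": [-5, -10, 0, -40, 5]},
    {"business_line": "FX", "legal_entity": "UK", "pnl": [3, 2, -8, 1, -4]},
]


def aggregate_pnl(positions, keys):
    """Sum P&L vectors scenario by scenario for positions sharing the same key values.

    Passing an empty list of keys gives a single firm-wide vector under the key ().
    """
    totals = {}
    for pos in positions:
        group = tuple(pos[k] for k in keys)
        if group not in totals:
            totals[group] = list(pos["pnl"])
        else:
            totals[group] = [a + b for a, b in zip(totals[group], pos["pnl"])]
    return totals


def var_es(pnl, confidence):
    """Historical-simulation VaR and ES, both returned as positive loss amounts.

    Assumed convention: the tail holds round(n * (1 - confidence)) worst scenarios
    (at least one); VaR is the smallest loss in the tail, ES is the tail average.
    """
    losses = sorted((-x for x in pnl), reverse=True)  # worst loss first
    n_tail = max(1, round(len(losses) * (1 - confidence)))
    tail = losses[:n_tail]
    return tail[-1], sum(tail) / n_tail


if __name__ == "__main__":
    # The same functions serve any aggregation level: by desk, by entity, or firm-wide.
    for keys in (["business_line"], ["legal_entity"], []):
        for group, vector in aggregate_pnl(positions, keys).items():
            print(keys, group, var_es(vector, confidence=0.6))
```

Because P&L vectors add scenario by scenario, the aggregation level can be changed on demand without re-running valuation. On a cluster the same grouping would typically be expressed as a distributed group-by, which is not shown here.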
