BigData in Banking
Challenges and Solutions
Arshavsky Andzhey
Director, Big Data dept., SberBank
Avarshavsky.sbt@sberbank.ru
andzhey@mac.com
2015
Innovations as killers:
stages in the destruction of the standard banking system
① Internet & social networks
Control and choice
② Screens and smartphones
Anywhere, any time
③ Mobile wallet
No more cash or plastic cards
④ Accounts without banks
No bank accounts
⑤ Big Data
Cross-system personalization and targeting
*Brett King, Bank 3.0
Big Data as the evolution of approaches to using data
Figure: information maturity stages, from lowest to highest (axes: the value of information for business; maturity of information usage methods)
1. Data for business: "Day-by-day operations"
2. Information for business analysis: "Data warehousing"
3. Information as a strategic asset: "Information in business context"
4. Information as an innovation enabler: "Business innovations based on information"
5. Information as a competitive differentiator: "Adaptive business strategy"
+ Internet and open data
Information challenges in large (XL) banks
Data is the most valuable asset in all XL banks
Few know how to apply data to solve even today's challenges
Few know how to leverage internet, external or open data sources to understand clients better and attract new customers
The key challenge with data analysis
By developing a Big Data infrastructure that solves the challenges of data pre-processing and attribution through an intelligent data processing framework, the company can cut the labor spent on preparing data for business application development by up to 70%.
Gartner estimates that 70% of the time spent on analytical projects goes to acquiring, cleaning and integrating data, mainly due to the following problems:
• Data is hard to locate because it is scattered across disparate business applications and systems
• To be suitable for analysis, data requires re-engineering and reformatting
• Acquiring data for analysis in a specified format places a huge burden on the teams that own the source systems. Often the same data is requested or purchased by several departments and business units, which creates additional work and chaos
• A process for regular data exchange has to be set up
Data and analytics tools as a shared resource
Client
Product
Transactions
Location
….
Instruments
RISKS Dept.
RETAIL Dept.
OPERATIONS Dept.
SECURITY Dept.
CORPORATE CLIENTS Dept.
HR
Big Data is less about data size and more about the opportunity to work with many different data types, formats and applications using powerful analytic capabilities.
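To make the point about many data types and formats concrete, here is a minimal sketch that brings a CSV export and a nested JSON feed to one uniform record shape. The field names, the sample data and the helper functions are all hypothetical and are not taken from the presentation.

```python
import csv
import io
import json
from decimal import Decimal

# Hypothetical sample inputs: the same kind of transaction in two formats.
CSV_TEXT = "client_id,amount,currency\n42,100.50,RUB\n7,20.00,USD\n"
JSON_TEXT = '[{"client": {"id": 42}, "sum": "15.25", "cur": "RUB"}]'


def from_csv(text):
    """Yield uniform transaction dicts from a flat CSV export."""
    for row in csv.DictReader(io.StringIO(text)):
        yield {
            "client_id": int(row["client_id"]),
            "amount": Decimal(row["amount"]),
            "currency": row["currency"],
        }


def from_json(text):
    """Yield uniform transaction dicts from a nested JSON feed."""
    for item in json.loads(text):
        yield {
            "client_id": int(item["client"]["id"]),
            "amount": Decimal(item["sum"]),
            "currency": item["cur"],
        }


def load_all():
    """Combine both sources into one list with a single schema."""
    return list(from_csv(CSV_TEXT)) + list(from_json(JSON_TEXT))


def total_by_client(records):
    """Sum amounts per (client, currency) across all sources."""
    totals = {}
    for record in records:
        key = (record["client_id"], record["currency"])
        totals[key] = totals.get(key, Decimal("0")) + record["amount"]
    return totals
```

Once every source is mapped to the same shape, downstream analytics such as `total_by_client(load_all())` no longer need to know which format a record came from.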
Sources of business growth and execution excellence
Client
ACQUISITION
RETENTION
SALES
PRIMARY
SECONDARY
LOANS
RISKS
DEBTS
ANTI-FRAUD
INTERNAL
EXTERNAL
HR
PROCESS OPTIMIZATION
①
②
③ ④
Data Factory concept
The Big Data Factory should enable data processing in a uniform manner for all platforms, functions and customers. The goal is to build an easily changeable, easy-to-use operating model for data processing, with the required level of trust for both traditional and non-traditional data sources.
Tasks: Information trust
Traditional and non-traditional data sources
• Information delivery
• Information integration
(Cleaning, Transformation,
Mapping, Improvement)
• Information search
• Access to information
• Hypothesis testing
• Model training and
information analysis
• Backup/ Cleanup/ Restore
• Administration
• Lifecycle management
• Data quality
• Reference data
• Record linkage and the resolution of
contradictions
• Classification
• Reporting
• Internet data
• Data virtualization
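One of the tasks listed above, record linkage and the resolution of contradictions, can be illustrated with a short sketch. It assumes a simple rule that is not stated in the presentation: records match when the normalized name and birth date agree, and conflicting fields take the value from the most recently updated record. All record fields and sample values are hypothetical.

```python
from collections import defaultdict

# Hypothetical client records held by two departments' systems.
RECORDS = [
    {"source": "retail", "name": "Ivan  Petrov", "birth_date": "1980-05-17",
     "phone": "+7 900 111-22-33", "updated": "2015-03-01"},
    {"source": "risk", "name": "ivan petrov", "birth_date": "1980-05-17",
     "phone": "+7 900 999-88-77", "updated": "2015-06-10"},
    {"source": "retail", "name": "Anna Sidorova", "birth_date": "1975-11-02",
     "phone": "+7 900 555-44-33", "updated": "2015-01-20"},
]


def link_key(record):
    """Matching key: case- and whitespace-normalized name plus birth date."""
    name = " ".join(record["name"].lower().split())
    return (name, record["birth_date"])


def link_records(records):
    """Group records that refer to the same client under the assumed rule."""
    groups = defaultdict(list)
    for record in records:
        groups[link_key(record)].append(record)
    return groups


def resolve(group):
    """Resolve contradictions: each field takes its newest non-empty value."""
    # ISO dates sort correctly as plain strings.
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    golden = {}
    for record in newest_first:
        for field, value in record.items():
            if field in ("source", "updated"):
                continue
            if value and field not in golden:
                golden[field] = value
    golden["sources"] = sorted({r["source"] for r in group})
    return golden


def build_golden_records(records):
    """Return one consolidated record per linked group."""
    return [resolve(group) for group in link_records(records).values()]
```

Exact-key matching is the simplest possible choice; real client data would likely need fuzzy matching and a documented survivorship policy per field.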
Big Data Competence Center: BIGDATA PLATFORM HIGH-LEVEL CONCEPT
Data Factory scenarios
Subject-matter experts across the Bank's business areas need access to the organization's data for research, sampling, annotation and modelling
Data scientists work on new
models
Marketing is looking for data for
new campaigns
Security services are looking for data
to drill into a suspicious transaction
The retail unit wants to make the
best offer to the client
……..
Daily activity
The need for ad hoc access to
diverse data
Support for analysis and decision
making
Use of subject-matter experts'
terminology when accessing
data
The goal is to provide access to data that is as easy as working in a spreadsheet, while scaling to huge volumes and a huge variety of information types, protecting sensitive information and optimizing storage systems.
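Letting subject-matter experts use their own terminology when accessing data can be sketched as a small business glossary that maps expert-facing terms to physical column names. The glossary entries, column names and sample rows below are hypothetical, chosen only to show the idea.

```python
# Hypothetical business glossary: expert-facing terms mapped to physical columns.
GLOSSARY = {
    "client": "cust_id",
    "monthly turnover": "trn_amt_30d",
    "overdue debt": "dlq_bal",
}

# Hypothetical rows as they might come from a source system.
ROWS = [
    {"cust_id": 1, "trn_amt_30d": 1200.0, "dlq_bal": 0.0},
    {"cust_id": 2, "trn_amt_30d": 300.0, "dlq_bal": 150.0},
]


def select(rows, terms):
    """Return rows keyed by business terms instead of physical column names."""
    unknown = [term for term in terms if term not in GLOSSARY]
    if unknown:
        raise KeyError(f"Terms not in glossary: {unknown}")
    return [{term: row[GLOSSARY[term]] for term in terms} for row in rows]


# An expert asks for clients with overdue debt without knowing column names.
debtors = [
    row for row in select(ROWS, ["client", "overdue debt"])
    if row["overdue debt"] > 0
]
```

Keeping the mapping in one place means a renamed source column changes a single glossary entry rather than every expert's query.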
Data-to-profit process
TASK FORMALIZATION
DATA PREPARATION
DATA EXPLORATION
ADDITIONAL INDICATORS
ANALYTICS & MODELING
MODEL VALIDATION
MODEL PRODUCTIZATION
EFFICIENCY MONITORING
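The middle stages of this process can be sketched as a chain of small functions. This is a toy illustration under stated assumptions, not the Bank's method: the data, the payment-to-income indicator and the single-threshold "model" are all hypothetical, and productization and efficiency monitoring are left out.

```python
from statistics import mean

# Hypothetical raw data: (monthly_income, monthly_payment, defaulted)
RAW = [
    (1000, 200, 0), (800, 500, 1), (None, 300, 0),
    (1500, 300, 0), (600, 450, 1), (900, 700, 1),
]


def prepare(rows):
    """Data preparation: drop rows with missing values."""
    return [row for row in rows if None not in row]


def explore(rows):
    """Data exploration: a basic summary of the target variable."""
    return {"rows": len(rows), "default_rate": mean(row[2] for row in rows)}


def add_indicators(rows):
    """Additional indicators: derive the payment-to-income ratio."""
    return [(payment / income, defaulted) for income, payment, defaulted in rows]


def fit_threshold(samples):
    """Modeling: pick the ratio threshold with the fewest training errors."""
    def errors(threshold):
        return sum((ratio >= threshold) != bool(d) for ratio, d in samples)

    candidates = sorted(ratio for ratio, _ in samples)
    return min(candidates, key=errors)


def validate(threshold, samples):
    """Model validation: accuracy of the threshold rule on the given samples."""
    hits = sum((ratio >= threshold) == bool(d) for ratio, d in samples)
    return hits / len(samples)


def run_pipeline(raw):
    """Run the stages in order and collect their outputs."""
    rows = prepare(raw)
    samples = add_indicators(rows)
    threshold = fit_threshold(samples)
    return {
        "summary": explore(rows),
        "threshold": threshold,
        # Simplification: validated on the training samples; a real project
        # would hold out separate data for this stage.
        "accuracy": validate(threshold, samples),
    }
```

Keeping each stage as a separate function mirrors the slide: a stage can be replaced or monitored without touching the others.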
Possible architecture
HDFS, raw data
Data exchange
Data preparation, processing and analytical layer
Analytical views
Ad-hoc analytics
Development factory
Streaming
Big Data applications. Integration.
Marts, API
BI & BIGDATA
Traditional BI:
• Based on DWH
• Precision is crucial
• Flat data scheme
• Long time to market
• High-end hardware
Big Data:
• Based on Hadoop and Spark
• Any precision
• Complex and variable data schemes
• Ad-hoc analytics
• Short time to market
• New data sources
• Low cost
Both approaches are valid
Big Data is not expensive: open source does work
Low cost
No vendor lock-in
Community support
APPLICATION LAYER
Spark
Hadoop
SQL
NoSQL DB
Thanks and good luck!
