Towards intelligent data insights in central banks
Challenges and opportunities for declarative languages
Luigi Bellomarini, IT Department
Roma, 24th Feb 2017
WHAT INSIGHTS
• Credit and credit risk data & Institutional Units register
• Securities Holdings Statistics & Securities register
• Monetary Financial Institutions Balance Sheets Statistics
• Monetary Financial Institutions Interest Rates Statistics
• Balance of Payments
• National Accounts
• Single Supervisory Mechanism
• Regulatory frameworks
THE DATA PRODUCTION PROCESS
[Diagram: data flow from the banks' operational systems to supranational statistics]
• Banks' operational systems
• Banks input layer
• Transformations by banks
• Banks output layer
• Primary reporting
• Transformations by central banks
• National statistical production
• Secondary reporting
• Transformations by international institutions
• Supranational statistical production
STANDARDIZATION
• Of processes, models and languages
• Guide the process in the banks
• Extract the data into harmonized models
• Standardize the transformations
• Validation & Transformation Language (VTL)
[Diagram labels: GSBPM, Process Model, Information Model, VTL; Operand, Operand → Expression → Result]
VTL: A STANDARD LANGUAGE (FROM SDMX INITIATIVE)
High-level and business oriented
• Fully declarative approach
• Logic and functional paradigms
Mathematical functions are first-class objects
• VTL manipulates data as mathematical functions
• Based on operators (higher-order functions)

Example data point, seen as a function from dimensions to measures and attributes:
(City = Naples, Sector = Private, Reference Date = 31 Dec 2010) → (Loans = 9.876.543, Deposits = 10.234.567, Loans value type = measured, Deposits value type = estimated)

Dimensions: City, Reference Date, Sector
Measures: Loans, Deposits
Attributes: Loans value type, Deposits value type

City    Reference Date  Sector   Loans      Deposits    Loans value type  Deposits value type
Naples  2010 12 31      private  9.876.543  10.234.567  measured          estimated
Naples  2010 12 31      public   543.210    654.321     measured          measured
Naples  2009 12 31      private  9.210.876  10.987.654  estimated         estimated
Naples  2009 12 31      public   876.543    1.654.123   measured          measured
…       …               …        …          …           …                 …
Rome    2010 12 31      private  1.234.567  1.546.897   measured          measured
…       …               …        …          …           …                 …
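
The view of a dataset as a mathematical function from dimension values to measure and attribute values can be sketched in Python. This is only an illustration, not VTL: the names `Key`, `Values`, `dataset` and `apply_to_measures` are hypothetical, the dates are rewritten in ISO form, and the amounts are the two Naples 2010 rows shown above, read as integers with "." as a thousands separator (an assumption).

```python
from typing import Callable, Dict, NamedTuple


class Key(NamedTuple):
    """Dimensions: together they identify one data point."""
    city: str
    reference_date: str
    sector: str


class Values(NamedTuple):
    """Measures and attributes attached to one data point."""
    loans: int
    deposits: int
    loans_value_type: str
    deposits_value_type: str


# A dataset is a finite function: each key maps to exactly one Values tuple.
dataset: Dict[Key, Values] = {
    Key("Naples", "2010-12-31", "private"): Values(9_876_543, 10_234_567, "measured", "estimated"),
    Key("Naples", "2010-12-31", "public"): Values(543_210, 654_321, "measured", "measured"),
}


def apply_to_measures(data: Dict[Key, Values], f: Callable[[int], int]) -> Dict[Key, Values]:
    """A higher-order operator: takes a dataset and a function, returns a new dataset."""
    return {
        key: v._replace(loans=f(v.loans), deposits=f(v.deposits))
        for key, v in data.items()
    }


# Express the measures in thousands (integer division); attributes are left untouched.
in_thousands = apply_to_measures(dataset, lambda x: x // 1000)
print(in_thousands[Key("Naples", "2010-12-31", "private")].loans)  # 9876
```

Because the operator returns a new dictionary instead of modifying its input, operators of this kind can be composed freely, which is the property the slide attributes to the functional paradigm.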
VTL – A GRAPH OF TRANSFORMATIONS
[Diagram: a graph in which datasets (D1, D2, …) feed transformations (T1, T2, …) that produce further datasets. Labelled areas: Banks & OFIs reports, C.C.R., Other data sources, Economic research models, Supervision models, Statistical bulletin, Statistical products.]
D4 = get ( D1_LOANS_FLOW, keep (DATE, BANK, AMOUNT), sum (AMOUNT))
D5 = get ( D2_LOANS_STOCK, keep (DATE, BANK, AMOUNT), sum (AMOUNT))
D6_CHECK = check( D5 = lag(D5, -1) + D4)
It’s a DAG!
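
Since the transformations form a DAG, they can be evaluated in any topological order. The following Python sketch mirrors the three expressions above under stated assumptions: the records, the helper names (`sum_amount`, `check_stock_flow`, `run`) and the reading of `lag(D5, -1)` as "the stock of the previous period" are all hypothetical and are not taken from the VTL specification.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical source records: (date, bank, amount); dates are period numbers.
SOURCES = {
    "D1_LOANS_FLOW": [(1, "A", 6), (1, "A", 4), (2, "A", 5)],
    "D2_LOANS_STOCK": [(0, "A", 100), (1, "A", 110), (2, "A", 116)],
}


def sum_amount(records):
    """Keep DATE and BANK as the key and sum AMOUNT over everything else."""
    totals = defaultdict(int)
    for date, bank, amount in records:
        totals[(date, bank)] += amount
    return dict(totals)


def check_stock_flow(stock, flow):
    """True where stock(t) == stock(t-1) + flow(t); periods without a predecessor are skipped."""
    return {
        (date, bank): amount == stock[(date - 1, bank)] + flow.get((date, bank), 0)
        for (date, bank), amount in stock.items()
        if (date - 1, bank) in stock
    }


# Transformation graph: output name -> (input names, function of the inputs).
TRANSFORMATIONS = {
    "D4": (["D1_LOANS_FLOW"], sum_amount),
    "D5": (["D2_LOANS_STOCK"], sum_amount),
    "D6_CHECK": (["D5", "D4"], check_stock_flow),
}


def run(sources, transformations):
    """Evaluate every transformation after its inputs; raises CycleError if the graph is not a DAG."""
    graph = {name: set(inputs) for name, (inputs, _) in transformations.items()}
    datasets = dict(sources)
    order = []
    for name in TopologicalSorter(graph).static_order():
        if name in transformations:  # source datasets are already available
            inputs, func = transformations[name]
            datasets[name] = func(*(datasets[i] for i in inputs))
            order.append(name)
    return datasets, order


datasets, order = run(SOURCES, TRANSFORMATIONS)
print(order)
print(datasets["D6_CHECK"])
```

In this toy data the check holds for period 1 (110 = 100 + 10) and fails for period 2 (116 ≠ 110 + 5), so `D6_CHECK` flags the second period.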
EXECUTION PLATFORMS - @Bank of Italy
User specification (VTL)
RGDP := PQR * RGDPPC
Logical representation
PQR(q,r,p), RGDPPC(q,r,g) → ∃z RGDP(q,r,z)
IT implementation
# R: join on the shared dimensions, multiply the measures, drop the inputs
tmp <- merge(PQR, RGDPPC, by = c("q", "r"))
tmp$i <- tmp$p * tmp$g
RGDP <- tmp[, !(names(tmp) %in% c("p", "g"))]
Rgdp = get_tab(pqr * rgdppc)
-- SQL
INSERT INTO RGDP(Q,R,P)
SELECT C2.Q AS Q, C2.R AS R, C1.P*C2.G AS P
FROM PQR C1, RGDPPC C2
WHERE C1.Q = C2.Q AND C1.R = C2.R
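
The semantics shared by these implementations (match the data points of the two datasets on the common dimensions q and r, then multiply the measures) can be sketched in plain Python. The function name `multiply` and the sample keys and values are hypothetical; the slide does not say what q and r stand for, so the quarter and region labels below are only a guess.

```python
def multiply(left, right):
    """Join two datasets on their common key (q, r) and multiply the measures.

    Keys present in only one dataset are dropped, as in the R merge and
    SQL join shown above.
    """
    return {key: left[key] * right[key] for key in left.keys() & right.keys()}


# Hypothetical data: (q, r) -> measure.
PQR = {("2016Q4", "North"): 100, ("2016Q4", "South"): 80, ("2017Q1", "North"): 101}
RGDPPC = {("2016Q4", "North"): 3, ("2016Q4", "South"): 2}

RGDP = multiply(PQR, RGDPPC)
print(sorted(RGDP.items()))
```

The key `("2017Q1", "North")` has no counterpart in `RGDPPC`, so it does not appear in the result.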
INFOSTAT
FROM A SET OF RULES ... TOWARDS A KNOWLEDGE BASE
Metadata-driven system
Declarative representation
Active dictionary
Integrated approach
NO INFERENCE
• Knowledge base generation
• First Order Languages
REASONING
AI
Cognitive Computing
CREDITS
• SDMX TWG and VTL task force (www.sdmx.org)
• Statistical Data and Concept Representation
The Banca d’Italia’s Active Statistical Metainformation System,
Modelling Levels in the Statistical Information System of Bank of Italy,
The “Matrix” Model: unified model for statistical representation and processing,
Vincenzo Del Vecchio, Fabio Di Giovanni et al.
(https://www.bancaditalia.it/statistiche/raccolta-dati/sistema-informativo-statistico/index.html)
• BIRD project (http://banks-integrated-reporting-dictionary.eu)
• GSBPM (http://www1.unece.org/stat/platform/display/GSBPM/GSBPM+v5.0)
https://creativecommons.org/licenses/by-nc-sa/3.0/
