Stephanie Rendón de la Torre
Ph.D. TALLINN UNIVERSITY OF TECHNOLOGY / Sr. Data Scientist- Swedbank
January 16th, 2020
From econophysics to networks to data
science: Estonian network of payments
• Introduction to econophysics
• Introduction to complex networks
• Future and current work
• Econophysics is an interdisciplinary research field that applies the methods of physics to
economic problems, e.g. describing and understanding market behaviour.
• Application of physics to the field of economics and finance.
• Basic tools: statistical physics
• The term “econophysics” was coined by H. Eugene Stanley in 1995 in Kolkata.
• Before the term ‘Econophysics’ was coined, many people from different branches of science
had already applied their knowledge to economics, which led to the evolution of
econophysics: 90’s boom!
• Mutual attraction between Physics and Economics
○ The number of physicists working on economic problems has increased dramatically in the last 15-20 years
• Much better economic data now / genuine interest in complex systems.
• Key year: 1973, when currencies began to be traded in financial markets. Daily turnover: USD 5.3 trillion.
100 days of NYSE trading = 1 day in FOREX
• Black-Scholes Model – Nobel Prize
• 80s= electronic trading! = data! Lots of data!
• Today: Big Data where data is the new oil
Money is a gas!
Distribution of money in a country
• Imagine money was a gas…
• Few people have a lot of money and a lot of people have little money
• Boltzmann-Gibbs distribution fits most of the data
• Income inequality is high
• Build models to explain why this is happening
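The gas analogy can be illustrated with a toy simulation. This is a minimal sketch, assuming a simple random-exchange rule: two randomly chosen agents pool their money and split it at random. The rule, the function name `simulate_money_gas` and all parameter values are illustrative assumptions, not taken from the slides. Total money is conserved, like energy in a closed gas, and the resulting distribution is close to exponential (Boltzmann-Gibbs): most agents end up below the mean.

```python
import random

def simulate_money_gas(n_agents=1000, start_money=100.0, n_steps=200_000, seed=0):
    """Toy random-exchange model (assumed rule, for illustration only)."""
    rng = random.Random(seed)
    money = [start_money] * n_agents          # everyone starts equal
    for _ in range(n_steps):
        i, j = rng.sample(range(n_agents), 2) # pick two distinct agents
        pool = money[i] + money[j]            # the pair's money is conserved
        share = rng.random()                  # random split of the pool
        money[i] = share * pool
        money[j] = pool - money[i]
    return money

money = simulate_money_gas()
mean_money = sum(money) / len(money)
# For an exponential distribution about 63% of agents hold less than the mean.
below_mean = sum(m < mean_money for m in money) / len(money)
print(f"mean = {mean_money:.1f}, fraction below mean = {below_mean:.2f}")
```

Even though every agent starts with the same amount, random exchanges alone are enough to produce many poor agents and a few rich ones.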
… „There is a great temptation to consider the
exchanges of money which occur in economic
interaction as analogous to the exchanges of
energy which occur in physical shocks between
Complex problem or complicated problem?
• Quantity of information needed to specify a system
• Quality: It is what makes the system complex and it has something to do
with the ability to understand a system; it refers to the existence of
emergent properties, which appear as a consequence of the interactions
of the components of the system.
• Complicated ≠ Complex
What is a system?
• Set of entities that form a unified whole through their interactions. A system is defined in terms of
its boundary, which determines the components that are or are not part of the system.
• Complex system: composed of many elements which may interact with each other. These
interactions give rise to collective behaviors.
Milgram, Psych Today 2, 60 (1967)
Dodds et al., Science 301, 827 (2003)
“Six degrees of separation”
Small world network
I really do know
Why model networks?
• Simpler representation of possibly very complex structures
• Can gain insight into how networks form and how they grow
• May allow mathematical derivation of certain properties
• Can serve to “explain” certain properties observed in real networks
• Can predict new properties or outcomes for networks that do not even exist
• Can serve as benchmarks for evaluating real networks
Why is Network Analysis relevant?
• Customers may influence other customers (their contacts) with
• Which customers to focus on and which to avoid
• What can we give to customers (from the bank's perspective)?
• What can we get from them?
• Often ignored is the importance of the interactions a customer might have with other
customers, and the inevitable influence their contacts have on one another.
Some applications for industry
•Reduce churn / increase sales
•Identify key customers-> Pinpoint influential customers
•Additional marketing opportunities
•Customer behaviors and pattern identification
•Analyzing the spread of contagion (marketing “buzz” effect)
•Targeted offering: offer a product to specific customers who are influential in terms of their
importance in the network of connections.
•Offering products and services: if A uses product/service X, how likely is it to be taken up by B?
•Anti-money laundering procedures - fraud
• Network science research in finances and economics: Huge potential.
• Big data era changes everything!
• Network science is an active interdisciplinary research field that originated
from a branch of mathematics, graph theory, and has extended in many directions.
• Complex networks can be biological, technological, economic, social…
• With complex networks it is possible to describe the structure of systems that are suitable to
be represented as graphs.
• Networks play an important role in a wide range of economic and social phenomena. The
use of techniques and methods from graph theory has permitted economic networks studies
to expand the knowledge and give insights into financial, social and economic phenomena.
• To study the structure (characteristics and dynamics) of the Estonian network of payments through
analysis of different experiments that involve:
– Global and local topology
– Community detection
– Fractal and multifractal properties
• This research work presents an extensive study that contributes to the field of complex networks by
adding empirical evidence with a new, unique and very interesting study case.
• The first study on economic development of a country from a complex network approach, through
Explore local, global, mesoscale
structures by using known methodologies
• Obtained from Swedbank’s databases.
• The data set is unique in its kind and very
interesting: ~80% of Estonian bank transactions
are executed through Swedbank’s system of
payments; hence, this data set reasonably
reproduces the structure of the Estonian
economy and can be used as a proxy of it.
• Domestic payments (company-to-company
electronic transactions) of 2014.
• 16,613 nodes, 2,617,478 payment
transactions, and 43,375 links.
• Nodes = Estonian companies
• Links = payments done between the
Total companies analysed (N): 16,613
Total number of payments analysed: 2,617,478
Total value of transactions: 3,803,462,026 *
Average value of transaction per customer: 87,600 *
Maximum value of a transaction: 121,533 *
Minimum value of a transaction (aggregated): 1,000 *
Average volume of transactions per company: 60
Maximum volume of transactions per company: 24,859
Minimum volume of transactions per company: 20
*All money quantities are expressed in monetary units and not in real currencies in order
to protect the confidentiality of the data set. The purpose of showing monetary units is to
provide a notion of the proportions of quantities and not to show exact amounts of
Two nodes are connected if there was a money transaction between them
• Links can have weights attached to them and can be directed or undirected
• The simplest quantity observed in a network is the degree. It measures how important a node is with respect
to its nearest neighbours. The degree of a node is the number of neighbours of that node and is defined as
$$k_i = \sum_{j \in \zeta(i)} a_{ij},$$
where the sum runs over the set $\zeta(i)$ of neighbours of $i$. For example: $\zeta(i) = \{ j \mid a_{ij} = 1 \}$.
In a directed network a node has 2 characteristics: the number of links that end at the node and the number of links
that start from the node. These quantities are known as the in-degree $k_i^{in}$ and the out-degree $k_i^{out}$ of the
node, and are defined as
$$k_i^{in} = \sum_{j} a_{ji}, \qquad k_i^{out} = \sum_{j} a_{ij}.$$
• Degree distribution (DD): the simplest statistical characteristic of a network. It characterizes only local properties of the network,
yet even this information is sufficient to determine basic properties.
• It is possible to categorize networks by the tails of their degree distributions.
• Degree distributions of real-world networks are different when compared to random networks.
• Random networks commonly show Poisson degree distributions, while real networks might have long tails in the right part of the
distribution with values that are far above the mean.
• Measuring the tail of the distribution of the degree could be achieved by building a plot of the
• This type of distribution is called scale-free: No natural scale.
$P(k) \sim k^{-\gamma}$
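The slides do not say how the exponent of the power-law tail was fitted. As one common option, the sketch below uses the continuous maximum-likelihood estimator $\hat{\gamma} = 1 + n / \sum_i \ln(x_i / x_{min})$ on synthetic data. The function name, the sample and the choice of `x_min` are assumptions for illustration. Real degrees are integers, so this is only an approximation for them.

```python
import math
import random

def powerlaw_exponent_mle(values, x_min):
    """Continuous maximum-likelihood estimate of gamma for the tail x >= x_min."""
    tail = [x for x in values if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    return 1.0 + len(tail) / log_sum

# Synthetic sample drawn from P(x) ~ x^(-gamma) by inverse-transform sampling.
rng = random.Random(42)
true_gamma, x_min = 2.4, 1.0
sample = [x_min * (1.0 - rng.random()) ** (-1.0 / (true_gamma - 1.0))
          for _ in range(20000)]

gamma_hat = powerlaw_exponent_mle(sample, x_min)
print(f"estimated gamma = {gamma_hat:.2f}")
```

A straight-line fit on a log-log plot is a simpler alternative, but it is usually more sensitive to noise in the tail.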
• Square matrix representation of the image of the network.
• For a simple network with node set N, the adjacency matrix A is a square |N| × |N|
matrix such that its element $a_{ij}$ is one when there is a link from node i to node j, and
zero when there is no link.
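As a small illustration, the sketch below builds an adjacency matrix from a list of payments and reads the in-degree and out-degree off its columns and rows. The company names and payments are made up for the example. A second matrix counts the number of payments per link, as one possible link weight.

```python
# Hypothetical payments as (payer, payee) pairs; not real data.
payments = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A"), ("A", "B")]

nodes = sorted({company for pair in payments for company in pair})
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

adj = [[0] * n for _ in range(n)]     # adj[i][j] = 1 if i paid j at least once
weight = [[0] * n for _ in range(n)]  # weight[i][j] = number of payments i -> j
for payer, payee in payments:
    i, j = index[payer], index[payee]
    adj[i][j] = 1
    weight[i][j] += 1

k_out = [sum(adj[i][j] for j in range(n)) for i in range(n)]  # row sums
k_in = [sum(adj[j][i] for j in range(n)) for i in range(n)]   # column sums

for name in nodes:
    i = index[name]
    print(name, "in-degree:", k_in[i], "out-degree:", k_out[i])
```

Repeated payments between the same pair change the weight but not the degree, which is why A has out-degree 2 even though it made three payments.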
Translate everything into
• This network has scale-free (SF) properties (degree distribution, and statistical distributions of the community structures: size, overlap and
• The power-law tail signals that the probability of finding companies paying out very large quantities of money is small. Moreover,
while companies have absolute freedom in choosing how much money to pay to the counterparties with whom they interact, the
overall system obeys a scaling law, which is a particular property of critical phenomena in highly interactive self-
• Small-world (7 degrees of separation). Low clustering coefficient (0.18).
• Disassortative (resilience: high-degree nodes connect with low-degree nodes).
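The three quantities named above (average shortest-path length, clustering coefficient and degree assortativity) can be computed on a toy undirected graph as follows. This is a sketch with a made-up five-node graph and hypothetical function names. It is not the procedure used for the payment network.

```python
from collections import deque

def average_clustering(neigh):
    """Mean local clustering; nodes with fewer than 2 neighbours count as 0."""
    total = 0.0
    for nbrs in neigh.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in neigh[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(neigh)

def average_path_length(neigh):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in neigh:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in neigh[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def degree_assortativity(neigh):
    """Pearson correlation of the degrees at both ends of every link."""
    xs, ys = [], []
    for u, nbrs in neigh.items():
        for v in nbrs:                      # each link is counted in both directions
            xs.append(len(neigh[u]))
            ys.append(len(neigh[v]))
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys))
    return cov / var if var else 0.0

# Toy graph: a triangle (0, 1, 2) with a tail 2 - 3 - 4.
neigh = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
clustering = average_clustering(neigh)
path_length = average_path_length(neigh)
assortativity = degree_assortativity(neigh)
print(clustering, path_length, assortativity)
```

A negative assortativity, as in this toy graph, means that high-degree nodes tend to be linked to low-degree nodes.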
(a) Degree distribution for the connectivity network of the Estonian network of payments. The X axis is the degree $k$ and the Y axis is $P(k)$; $P(k) \sim k^{-2.4}$.
(b) Out-degree distribution of the network, $P(k) \sim k^{-2.39}$.
(c) In-degree distribution, $P(k) \sim k^{-2.49}$.
Node out-degree distribution by strength. $P(k) \sim$
displays the link weight distribution (volume: number of payments transacted). Probability $P(s)$ that a company has $k$ outgoing links.
• Robustness tests: centralities and collective influence nodes (Morone & Makse)
– I found the nodes that prevent the network from breaking into disconnected components (percolation threshold = 8%).
– The most influential companies in the network are not necessarily those which have more economic activity. Only
a small number of companies maintain the unity of the network.
– As a scale-free network (SFN), it is robust against random attacks, but the tests also revealed its vulnerability to targeted attacks.
– Financial systems show this pattern: the system collapses when a few nodes collapse.
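The contrast between random and targeted attacks can be sketched on a toy hub-and-spoke network. The slides rank nodes by collective influence (Morone & Makse). This sketch swaps in a plain highest-degree ranking to stay short, and the network, function names and parameters are all assumptions for illustration.

```python
import random
from collections import deque

def giant_component_size(neigh, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in neigh:
        if start in seen:
            continue
        seen.add(start)
        size, queue = 0, deque([start])
        while queue:
            u = queue.popleft()
            size += 1
            for w in neigh[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best

def build_hub_network(n_hubs=3, leaves_per_hub=20):
    """Toy network: a chain of hubs, each hub linked to its own leaf nodes."""
    neigh = {}
    def link(u, v):
        neigh.setdefault(u, set()).add(v)
        neigh.setdefault(v, set()).add(u)
    for h in range(n_hubs):
        if h > 0:
            link(f"hub{h - 1}", f"hub{h}")
        for leaf in range(leaves_per_hub):
            link(f"hub{h}", f"leaf{h}_{leaf}")
    return neigh

neigh = build_hub_network()
n_remove = 3

# Targeted attack: remove the highest-degree nodes first.
by_degree = sorted(neigh, key=lambda v: len(neigh[v]), reverse=True)
targeted = giant_component_size(neigh, by_degree[:n_remove])

# Random attack: average over several random choices of removed nodes.
rng = random.Random(0)
nodes = sorted(neigh)
random_sizes = [giant_component_size(neigh, rng.sample(nodes, n_remove))
                for _ in range(20)]
random_mean = sum(random_sizes) / len(random_sizes)

print("intact:", giant_component_size(neigh, []))
print("after targeted removal:", targeted)
print("after random removal (mean):", random_mean)
```

Removing the three hubs shatters the toy network into isolated nodes, while removing three random nodes usually leaves it almost intact.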
Plots of the effect of the targeted and random removal of nodes from the network of payments. (a) The average shortest-path length < l > in the GCC
plotted against the percentage of removed nodes. (b) The GCC plotted against the percentage of removed nodes. Continuous lines display the effect
of the targeted removal of nodes and the dashed lines display the effect of the random removal of nodes. Pc are the percolation thresholds, for each
• First multifractal analysis (MFA) of a complex network of payments where specific fractal and multifractal (MF) properties were studied.
• One can study a simplified version of the network (the skeleton network, SkN) and still capture the general structure of the original network.
• The SkN had a slightly smaller fractal dimension (FD) than the original network, and they both were very similar: SkN preserves structure while
• Fractal scaling analysis by estimating the FD of the network and its skeleton. Then, study MF behaviour by using a sandbox algorithm
to calculate the spectrum of $D(q)$ and $\tau(q)$.
• $N_B(r_B) \sim r_B^{-d_B}$
Graph representation of the skeleton of the Estonian network of payments.
Fractal scaling representation of the network. The original network (o) and the skeleton
network (●). The straight line is included for guidance and has a slope of 2.3.
Lateral size of the
FD is the absolute value of the
slope of the linear fit
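Reading the fractal dimension off the slope can be sketched as a least-squares fit in log-log coordinates. The box sizes and box counts below are synthetic, generated from an exact power law with exponent 2.3 purely to show the mechanics. They are not the measured values, and the function name is hypothetical.

```python
import math

def fractal_dimension(box_sizes, box_counts):
    """Absolute slope of the least-squares line through (log r, log N)."""
    xs = [math.log(r) for r in box_sizes]
    ys = [math.log(n) for n in box_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return abs(slope)

# Synthetic box-covering data following N ~ r^(-2.3) exactly.
box_sizes = [1, 2, 3, 4, 5, 6]
box_counts = [10000.0 * r ** -2.3 for r in box_sizes]

fd = fractal_dimension(box_sizes, box_counts)
print(f"estimated fractal dimension = {fd:.2f}")
```

With real box counts the points scatter around the line, so the range of box sizes used for the fit affects the estimate.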
• Study the general multifractal (MF) structure and explore statistical measures: sandbox algorithm.
• Fixed-size box-counting algorithm, one of the most efficient and best known for multifractal analysis (MFA), adapted for networks.
• I calculated the spectrum of the mass exponents $\tau(q)$ and generalized fractal dimensions $D(q)$ curves. Result: the Estonian economy is MF.
• The $D(q)$ spectra show large values, which means that the distribution of links is quite irregular in the network,
suggesting there are hubs contrasting with other nodes holding few links. This structure could be relevant when
specific critical events occur in the economy that could threaten the whole network.
• The multifractality of a complex network can be determined by the shape of the $\tau(q)$ or $D(q)$ curves. If $\tau(q)$ is a straight line or $D(q)$ is a
constant, then the network is monofractal; if instead $D(q)$ or $\tau(q)$ have convex shapes, then the network is MF.
• D(q) decreases sharply after q=-4. High densities around the hubs. (interesting feature)
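The link between the two curves can be made explicit. The block below uses the standard sandbox formulation. The symbols $M(r)$ (number of nodes within distance $r$ of a randomly chosen centre node), $M_0$ (total number of nodes) and $d$ (network diameter) are assumptions taken from that formulation, not from the slides.

```latex
\[
\left\langle \left[ \frac{M(r)}{M_0} \right]^{q-1} \right\rangle
\sim \left( \frac{r}{d} \right)^{(q-1)\,D(q)},
\qquad
\tau(q) = (q-1)\,D(q).
\]
% Monofractal case: D(q) = D_0 for all q, hence
\[
\tau(q) = (q-1)\,D_0 ,
\]
% which is a straight line in q with slope D_0.
```

This is why a constant $D(q)$ and a straight-line $\tau(q)$ are the same monofractal signature, while any dependence of $D(q)$ on $q$ bends $\tau(q)$ and indicates multifractality.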
(a) Plot of mass exponents $\tau(q)$ as a function of q. (b) Plot of generalized fractal dimensions $D(q)$ as a function of q. Curves
indicated by circles represent numerical estimations of the mass exponents and generalized fractal dimensions, respectively.
few companies have the role of hubs,
while the rest are just small participants
• Communities: networks have sections in which the nodes are more densely connected to each other than to
the rest of the nodes in the network. Graph partitioning process.
• Locating communities allows an easier study/understanding of the network, and provides insights revealing
relevant groups of nodes, creating meaningful classifications, discovering similarities, etc.
• I studied the overlapping community structures by using the Clique Percolation Method (CPM). Output: features for predictive analytics,
targeted campaigns or segmentation models.
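The idea behind clique percolation can be sketched for clique size k = 3: a community is the union of triangles that can be reached from one another through shared links. The clique size, the toy edge list and the function name are assumptions, since the slides do not state the parameters used. The brute-force triangle search is only suitable for very small graphs.

```python
from itertools import combinations

def clique_percolation_k3(edges):
    """Overlapping communities as unions of triangles that share a link."""
    neigh = {}
    for u, v in edges:
        neigh.setdefault(u, set()).add(v)
        neigh.setdefault(v, set()).add(u)

    # Enumerate all triangles (3-cliques) by brute force.
    triangles = [frozenset((a, b, c))
                 for a, b, c in combinations(sorted(neigh), 3)
                 if b in neigh[a] and c in neigh[a] and c in neigh[b]]

    # Union-find: merge triangles that share 2 nodes (i.e. one link).
    parent = list(range(len(triangles)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in combinations(range(len(triangles)), 2):
        if len(triangles[i] & triangles[j]) == 2:
            parent[find(i)] = find(j)

    groups = {}
    for idx, triangle in enumerate(triangles):
        groups.setdefault(find(idx), set()).update(triangle)
    return sorted(groups.values(), key=sorted)

# Hypothetical payment links between companies A..F.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("D", "F"), ("E", "F")]
communities = clique_percolation_k3(edges)

# Membership number: in how many communities each node appears.
membership = {}
for community in communities:
    for node in community:
        membership[node] = membership.get(node, 0) + 1

# Overlap size: number of nodes shared by each pair of overlapping communities.
overlap_sizes = [len(a & b) for a, b in combinations(communities, 2) if a & b]
print(communities, membership, overlap_sizes)
```

In this toy example company D belongs to both communities, which is the kind of overlap that a plain graph partition cannot represent.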
Visual representation of a section of the overlapping network of communities (Estonian network of payments). The circles (nodes) represent communities and the
black lines between them represent shared nodes between communities.
Cumulative distribution of community degrees d.
Cumulative distribution function of the membership
Cumulative distribution function of the overlap size $s_o$.
Cumulative community size distribution at different times t.
$m_i$ = number of communities to which node $i$ belongs.
Max $m$ = 10 (a company can belong to a maximum of 10 communities).
nodes that belong to many different communities is quite small, while nodes belonging to at least 1 is
Probability of a community having a size higher than or equal to $s$.
Scaling tail is higher as t
Many small communities coexisting with few large ones.
Network of overlapping communities:
Community degree: number of links
Community degrees at the end of the tail: biggest customers.
Central part decays faster.
Observable curvature in log-log plot. No approximation method fitted the distribution. K max = 63.
The range in which communities overlap with each other.
The overlap size = number of nodes that 2 communities share.
$P(s_o)$: proportion of overlaps larger than $s_o$.
The largest overlap size is 22; at $s_o \ge 9$ the number of
• Community structure: investigate if the similarities in communities’ features amongst different
complex networks arise randomly or if there are any unknown properties shared by all of
• Predicting changes in a payment network through community detection analysis. Further
applications: strengthening relationships between companies of the same community to
improve the performance of the whole network, targeted marketing, identification of patterns
between companies and tracking of suspicious activities.
• Multifractality: Potential factors that drive the strength of the multifractal spectrum. Some
applications: Studying the origin of such factors.
• Patterns and the changes of the multifractal spectrum during financial crisis periods for risk
pattern recognition purposes, using different probability measures
• Building network models to forecast country money flows or potential industry growth trends
based on transactions.
Current work in progress at the bank: data
science topics, AI, Machine learning…
• Propensity models - machine learning
• Community networks – identifying customers with similar needs (clustering)
• Chatbot: NLP, text analysis.
• Payment classifiers
• Recommender systems
• Cash flow predictions – TensorFlow
• Tools: open-source software - Scala, Python, PySpark, etc.
List of publications
• Rendón de la Torre S., Kalda J., Kitt R., Engelbrecht J. (2016). On the topologic structure of economic
complex networks: Empirical evidence from large scale payment network of Estonia. Chaos, Solitons &
Fractals, 90, 18−27 DOI:10.1016/j.chaos.2016.01.018.
• Rendón de la Torre S., Kalda J., Kitt R., Engelbrecht J. (2017). Fractal and multifractal analysis of complex
networks: Estonian network of payments. The European Physical Journal B, 90. DOI: 10.1140/epjb/e2017-
• Rendón de la Torre S., Kalda J. (2018) Review of structures and dynamics of economic complex networks:
Large scale payment network of Estonia. In: Chen Z., Dehmer M., Emmert-Streib F., Shi Y. (eds.),
Modern and interdisciplinary problems in network science. Taylor & Francis CRC Group, USA, 193-226
• Rendón de la Torre S., Kalda J., Kitt R., Engelbrecht J. (2019) Detecting overlapping community structure:
Estonian network of payments. Proceedings of the Estonian Academy of Sciences, 68(1) 79-88.
• Rendón de la Torre S., Kalda J., Kitt R. (2019) Specific statistical properties of the strength of links and
nodes of the Estonian network of payments. Proceedings of the Estonian Academy of Sciences. Manuscript