- 1. Center for Financial Studies at the Goethe University PhD Mini-course Frankfurt, 25 January 2013 Financial Networks VI. Correlation Networks Dr. Kimmo Soramäki Founder and CEO FNA, www.fna.fi
- 2. Agenda V. Inferring Links • Prices and returns • Controlling for common factors • Correlation and dependence • Significant correlations • Multiple Comparisons VI. Correlation Networks • Distance and Hierarchical Clustering • Minimum Spanning Tree & PMFG • Other filtering • Layout algorithms
- 3. Hierarchical structure in financial markets
- 4. Minimum Spanning Tree A spanning tree of a graph is a subgraph that: 1. is a tree and 2. connects all the nodes together. The length of a tree is the sum of the lengths of its links. The minimum spanning tree (MST) is the spanning tree with the shortest length. The MST reflects the hierarchical structure of the correlation matrix.
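The definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the FNA implementation: it assumes the distance d = sqrt(2(1 - rho)) used in Mantegna (1999), builds the tree with Kruskal's algorithm, and runs on a made-up correlation matrix with hypothetical labels A to D.

```python
import math

def correlation_distance(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 2] (Mantegna, 1999)."""
    return math.sqrt(2.0 * (1.0 - rho))

def minimum_spanning_tree(labels, corr):
    """Kruskal's algorithm on the complete graph defined by a correlation matrix.

    Returns a list of (distance, label_a, label_b) links in the tree.
    """
    n = len(labels)
    # All candidate links, shortest (most correlated) first.
    links = sorted(
        (correlation_distance(corr[i][j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))  # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for dist, i, j in links:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # the link joins two components, so no cycle
            parent[root_i] = root_j
            tree.append((dist, labels[i], labels[j]))
    return tree

# Hypothetical correlations, for illustration only.
labels = ["A", "B", "C", "D"]
corr = [
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.5, 0.2],
    [0.3, 0.5, 1.0, 0.6],
    [0.1, 0.2, 0.6, 1.0],
]
mst = minimum_spanning_tree(labels, corr)
for dist, a, b in mst:
    print(f"{a} - {b}: {dist:.3f}")
```

A tree on n nodes always has n - 1 links, so the four hypothetical assets are connected by three links, the three strongest correlations that do not close a loop.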
- 5. MST and Hierarchical Structure Source: R.N. Mantegna (1999). Hierarchical structure in financial markets, Eur. Phys. J. B 11, 193-197
- 6. Single Linkage Clustering • A method for hierarchical clustering • Clusters based on similarity or distance • SLINK algorithm R. Sibson (1973). SLINK: an optimally efficient algorithm for the single-link cluster method. The Computer Journal (British Computer Society) 16 (1): 30–34.
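As a rough illustration of single linkage, the sketch below repeatedly merges the two clusters whose closest members are nearest to each other. It is a naive cubic-time version written for readability, not Sibson's SLINK algorithm, and the distance matrix and labels are hypothetical.

```python
def single_linkage(labels, dist):
    """Naive agglomerative single-linkage clustering.

    Returns the merge history as (cluster_a, cluster_b, merge_distance),
    where each cluster is a tuple of labels.
    """
    clusters = [(i,) for i in range(len(labels))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: cluster distance is that of the closest pair.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merged = tuple(sorted(clusters[a] + clusters[b]))
        merges.append((
            tuple(labels[i] for i in clusters[a]),
            tuple(labels[i] for i in clusters[b]),
            d,
        ))
        # Replace the two merged clusters with their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges

# Hypothetical distance matrix, for illustration only.
labels = ["A", "B", "C", "D"]
dist = [
    [0.0, 0.6, 1.2, 1.3],
    [0.6, 0.0, 1.0, 1.25],
    [1.2, 1.0, 0.0, 0.9],
    [1.3, 1.25, 0.9, 0.0],
]
merges = single_linkage(labels, dist)
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The merge distances produced by single linkage coincide with the link lengths of the minimum spanning tree of the same distance matrix, which is why the two views of the hierarchy agree.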
- 7. Example
    # build network from correlations
    buildbycorrelationd -file daxreturns-2011-recon.csv -missing Alert -preserve false
    # calculate distance
    corrdistance -p correlation -method gower
    # calculate single linkage clustering
    slink -p corrdistance
    # create heatmaps
    heatmap -sortv vertex_id -p correlation -symmetric true -cellsizedefault 13 -transition 0 -cellhover correlation -palette darkblue-lightgray-darkred -colordomain (-1)-1 -saveas daxheat-slink-Y
- 8. Heatmap panels: Unordered, Principal Component Removed; Ordered by Cluster, Principal Component Removed
- 9. Radial tree layout • Calculates coordinates for radial layout as presented in Bachmaier, Brandes and Schlieper (2005) • The layout allows definition of each arc length • Specific parameters of command radialtreeviz: – Arc length property (-p) : Arc property defining arc length. Optional. – Root vertex (-rootvertex) : Id of root vertex. The root vertex is placed in the middle of the screen. Because the tree is repositioned, nodes may be placed outside the canvas in networks other than the first. Optional. – Optimal rotation (-rotation) : Rotates layout to minimize sum of vertex distances between subsequent networks. Optional. By default 'false'. – Scaling (-scale) : Scale of visualization: value/pixel. Christian Bachmaier, Ulrik Brandes, and Barbara Schlieper (2005). Drawing Phylogenetic Trees. Department of Computer & Information Science, University of Konstanz, Germany
- 10. Putting it all together
    # build network from correlations
    buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
    # calculate distance
    corrdistance -p correlation -method gower
    # calculate minimum spanning tree
    minst -p corrdistance
    # drop arcs not in MST
    dropa -e minst=false
    # calculate 1 - absolute correlation (used as arc length)
    calcap -e 1-abs(correlation) -saveas vizdistance
    # create radial tree visualization
    radialtreeviz -p vizdistance -vlabel vertex_id -vsize stdev -transition 3000 -ahover correlation -saveas daxviz-MST
- 11. Asset Trees Size of node reflects volatility (standard deviation) of returns. Links between nodes reflect 'backbone' correlations: short link = high correlation, long link = low correlation
- 12. Circle tree visualization • Calculates coordinates for circle tree layout as presented in Bachmaier, Brandes and Schlieper (2005) • As before but instead of radialtreeviz: circletreeviz -vlabel vertex_id -vsize stdev -transition 3000 -ahover correlation -saveas daxviz-MST-circle
- 13. Planar Maximally Filtered Graph (PMFG) • A complex graph with loops and cliques of up to 4 elements. It can be drawn on a planar surface without link crossings. • MST is contained in PMFG • In the figure, node size scales with degree M. Tumminello, T. Aste, T. Di Matteo and R. N. Mantegna (2005). A Tool for Filtering Information in Complex Systems. PNAS vol. 102 no. 30 pp. 10421–10426
- 14. PMFG command
    # build network from correlations
    buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
    # calculate distance
    corrdistance -p correlation -method gower
    # calculate PMFG
    pmfg -p corrdistance
    # drop arcs not in PMFG
    dropa -e pmfg=false
    # calculate absolute correlation (used for arc transparency)
    calcap -e abs(correlation) -saveas vizdistance
    # calculate degree
    degree
    # create visualization
    frviz -vlabel vertex_id -vsize stdev -atransparency vizdistance -ahover correlation -transition 3000 -arrows false -saveas daxviz-PMFG
- 15. Partial Correlation • Measures the degree of association between two random variables after removing the effect of the other variables • What is the direct relationship between Adidas and Allianz, controlling for BASF, BAYER, ... ? • We build regression models for Adidas and Allianz and look at the correlation of their model residuals (i.e. what is left unexplained by the other factors) -> Partial correlation
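The residual-based recipe above can be sketched as follows. To keep the example dependency-free it controls for a single variable z only, whereas the slide controls for all other stocks at once, and the three return series are made up. With one control variable the result should match the closed-form expression (r_xy - r_xz r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2)), which the sketch computes as a cross-check.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def pearson(xs, ys):
    """Pearson correlation of two equally long series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def residuals(ys, zs):
    """Residuals of an ordinary least squares fit of ys on zs (with intercept)."""
    mz, my = mean(zs), mean(ys)
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    intercept = my - slope * mz
    return [y - (intercept + slope * z) for y, z in zip(ys, zs)]

def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after regressing both on the control zs."""
    return pearson(residuals(xs, zs), residuals(ys, zs))

# Hypothetical daily returns: z is the control, x and y are the pair of interest.
z = [0.010, -0.020, 0.015, 0.005, -0.010, 0.020, -0.005, 0.000]
x = [0.012, -0.018, 0.020, 0.001, -0.012, 0.018, -0.002, 0.004]
y = [0.008, -0.025, 0.010, 0.009, -0.006, 0.024, -0.009, -0.003]

r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
closed_form = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
print("plain correlation:  ", r_xy)
print("partial correlation:", partial_correlation(x, y, z))
print("closed form:        ", closed_form)
```

With several control variables the same idea applies, but each regression becomes a multiple regression and a linear algebra library is the practical choice.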
- 16. Example
    # build network from correlations
    buildbypartialcorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -preserve false
    # show as heatmap
    heatmap -sortv vertex_id -p partial_correlation -symmetric true -cellsizedefault 13 -transition 0 -cellhover partial_correlation -palette darkblue-lightgray-darkred -colordomain (-1)-1 -saveas daxheat-partial-Y
- 17. Heatmap panels: Correlations; Partial Correlations
- 18. NETS • Network Estimation for Time Series • Forthcoming paper by Barigozzi and Brownlees • Estimates an unknown network structure from multivariate data • Captures both contemporaneous and serial dependence (partial correlations and lead/lag effects)
- 19. Correlation filtering (figure panels: PMFG, Influence Network) • Balance between too much and too little information • One of many methods to create networks from correlation/distance matrices: PMFGs, Partial Correlation Networks, Influence Networks, Granger Causality, NETS, etc. • New models based on graph theory, information theory, economics and statistics are being actively developed
- 20. Sammon’s Projection Proposed by John W. Sammon in IEEE Transactions on Computers 18: 401–409 (1969). A nonlinear projection method to map a high dimensional space onto a space of lower dimensionality. Example (figure legend): Iris Setosa, Iris Versicolor, Iris Virginica
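Sammon's projection chooses the low-dimensional coordinates that minimize a stress function, E = (1 / sum d*_ij) * sum (d*_ij - d_ij)^2 / d*_ij, where d*_ij are the original distances and d_ij the distances in the layout. The sketch below only evaluates that stress for a given layout and does not perform the optimization; the distance matrix and coordinates are hypothetical.

```python
import math

def sammon_stress(target, points):
    """Sammon's stress E for a low-dimensional layout.

    target: matrix of original distances d*_ij (must be > 0 off the diagonal)
    points: list of coordinate tuples in the projected space
    """
    n = len(points)
    total = 0.0   # sum of original distances (normalizing constant)
    error = 0.0   # sum of squared distortions, weighted by 1 / d*_ij
    for i in range(n):
        for j in range(i + 1, n):
            d_star = target[i][j]
            d = math.dist(points[i], points[j])
            total += d_star
            error += (d_star - d) ** 2 / d_star
    return error / total

# Hypothetical distances of a 3-4-5 right triangle.
target = [
    [0.0, 3.0, 4.0],
    [3.0, 0.0, 5.0],
    [4.0, 5.0, 0.0],
]
exact_2d = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]   # reproduces all distances
flat_1d = [(0.0,), (3.0,), (7.0,)]                # forced onto a line

print("stress of exact 2-D layout:", sammon_stress(target, exact_2d))
print("stress of 1-D layout:      ", sammon_stress(target, flat_1d))
```

Because each squared distortion is divided by the original distance, errors on short distances are penalized more heavily than the same errors on long ones, so local structure is preserved preferentially.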
- 21. Example
    # build network from correlations
    buildbycorrelationd -file daxreturns-2011.csv -missing Alert -savestdev -savereturns -preserve false
    # calculate distance
    corrdistance -p correlation -method gower
    # calculate Sammon layout
    sammonlayouta -p corrdistance -saveerror true
    # sum up error
    sumaforv -p error -saveas error
    # create visualization
    sammonaviz -p corrdistance -vlabel vertex_id -vsize error -transition 3000 -ahover error -saveas daxviz-Sammon-Y
- 22. Node size reflects error in layout
- 23. Tutorials • Tutorial 1 – Loading Networks into FNA • Tutorial 2 – Managing Data in FNA • Tutorial 3 – Network Summary Measures • Tutorial 4 – Centrality Measures • Tutorial 5 – Connectedness and Components • Tutorial 6 – Network Visualization • Tutorial 7 – Correlation Networks • Tutorial 8 – Payment System Simulations • Tutorial 9 – Analyzing Cross-Border Banking Exposures
- 24. Blog, Library and Demos at www.fna.fi Dr. Kimmo Soramäki kimmo@soramaki.net Twitter: soramaki