Centrality measures are widely used in social network analysis. The goal of this presentation is to explain how different centrality measures work and how they can be compared.
Centrality measures covered: degree, closeness, harmonic, Lin's index, betweenness, eigenvector, Seeley's index, PageRank, HITS, and SALSA.
This summary provides an overview of centrality measures in social network analysis:
1) There are several approaches to measuring centrality in a network, including degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality. These measures capture different aspects of a node's importance or influence.
2) Degree centrality focuses on the number of ties a node has, closeness looks at distance to all other nodes, and betweenness considers dependency on shortest paths. Eigenvector centrality captures the influence of important neighbors.
3) Comparing centrality measures can provide insight into a network's structure and a node's role, such as brokering connections vs. being well-connected locally.
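As a concrete sketch of how these measures can be compared in practice (assuming Python with the networkx library, which the presentations do not necessarily use), the four measures can be computed side by side on Zachary's karate club network:

```python
import networkx as nx

# Zachary's karate club: a standard 34-node benchmark network
G = nx.karate_club_graph()

measures = {
    "degree": nx.degree_centrality(G),            # number of ties, normalized
    "closeness": nx.closeness_centrality(G),      # inverse average distance to all others
    "betweenness": nx.betweenness_centrality(G),  # share of shortest paths passing through
    "eigenvector": nx.eigenvector_centrality(G),  # influence of important neighbors
}

# The rankings often disagree, which is exactly why comparing measures is informative
for name, scores in measures.items():
    top = max(scores, key=scores.get)
    print(f"{name:12s} top node: {top}")
```

On this particular network the two hubs (nodes 0 and 33) dominate most rankings; on other graphs the measures can diverge sharply, e.g. a broker with few ties can still have high betweenness.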
Community detection algorithms are used to identify densely connected groups of nodes in networks. Modularity optimization is commonly used, which detects communities as groups of nodes with more connections within groups than expected by chance. Parameters like resolution affect results. Multilayer networks model systems with multiple network layers over nodes. Multilayer modularity generalizes modularity to multilayer networks. Community detection in multilayer networks provides insights into structures across data types and applications.
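As an illustrative sketch (assuming a recent networkx release; the documents themselves may use other tools), modularity optimization with a resolution parameter looks like this:

```python
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()

# Greedy modularity maximization (Clauset-Newman-Moore).
# resolution > 1 favors smaller communities, resolution < 1 favors larger ones.
comms = community.greedy_modularity_communities(G, resolution=1.0)
q = community.modularity(G, comms)
print(len(comms), round(q, 3))
```

Re-running with a different `resolution` changes how many communities come back, which is the parameter sensitivity the summary refers to.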
This document provides an overview of social network analysis (SNA) including concepts, methods, and applications. It begins with background on how SNA originated from social science and network analysis/graph theory. Key concepts discussed include representing social networks as graphs, identifying strong and weak ties, central nodes, and network cohesion. Practical applications of SNA are also outlined, such as in business, law enforcement, and social media sites. The document concludes by recommending when and why to use SNA.
This document discusses complex network analysis and several concepts related to social networks such as the small world phenomenon, friendship paradox, centrality measures, clustering coefficient, and degree distribution. It provides examples of applying complex network analysis to a friendship network of BITS students and a Twitter growth model. Power law distributions are found to describe properties of many real-world networks like degree distributions.
Social network analysis (SNA) is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs, and other connected information/knowledge entities. SNA provides both a visual and a mathematical analysis of human relationships.
Community Detection in Social Networks: A Brief Overview (Satyaki Sikdar)
The document provides an overview of community detection in social networks. It discusses that networks are found wherever there are interactions between actors. It then motivates the importance of detecting communities by explaining that communities are groups of nodes that likely share properties and roles. Detecting communities has applications such as improving recommendation systems and parallel computing. It also justifies the existence of communities in real networks using the concept of homophily, whereby similar actors tend to connect. The document then discusses different approaches to detecting communities, including the Girvan-Newman algorithm, based on edge betweenness, and the Louvain method, which uses greedy modularity optimization.
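A minimal sketch of the Louvain method mentioned above, assuming networkx (version 2.8 or later) rather than any tool used in the original slides:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities, modularity

G = nx.karate_club_graph()

# Louvain greedily moves nodes between communities to increase modularity;
# the node visiting order is randomized, so a seed makes the run reproducible.
parts = louvain_communities(G, seed=42)
q = modularity(G, parts)
print(len(parts), round(q, 3))
```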
The document discusses concepts in social network analysis including measuring networks through embedding measures and positions/roles of nodes. It covers network measures such as reciprocity, transitivity, clustering, density, and the E-I index. It also discusses positions like structural equivalence and regular equivalence and how to compute positional similarity through adjacency matrices.
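To make the adjacency-matrix idea concrete, here is a small example on toy data (not taken from the document): two nodes are structurally equivalent when their rows of the adjacency matrix are identical, so row-wise distance serves as a positional similarity measure.

```python
import numpy as np

# Toy undirected network: nodes 0 and 1 have identical neighborhoods (nodes 2 and 3)
A = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
])

# Positional similarity: Euclidean distance between adjacency rows
# (distance 0 means the two nodes are structurally equivalent)
n = len(A)
dist = np.array([[np.linalg.norm(A[i] - A[j]) for j in range(n)] for i in range(n)])
print(dist)
```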
This document discusses data mining in social networks. It covers topics like social network analysis, graph mining, and text mining on social media platforms. Graph mining is used to understand relationships and extract communities from social networks. Text mining techniques like clustering and anomaly detection are applied to textual data from blogs, messages, etc. on social platforms. The document also discusses accessing Facebook data through its API and SDK, and applications and limitations of social network analysis.
Social Network Analysis: What It Is, Why We Should Care, and What We Can Lear... (Xiaohan Zeng)
This document provides an overview of social network analysis, including what social networks are, what can be learned from analyzing social networks, and how social network analysis can be performed. Some key findings that can be uncovered include the six degrees of separation principle, the 80-20 rule of social popularity where a minority of nodes have most connections, how to identify influential nodes, and how to group similar nodes into communities. Various metrics and models are described for analyzing features like path lengths, degree distributions, ranking nodes, measuring community structure, and more. Examples of social network analysis are also provided.
This document discusses link analysis and PageRank, an algorithm for identifying important nodes in large network graphs. It begins with an overview of graph data structures and the goal of identifying influential nodes. It then introduces PageRank, explaining its basic assumptions and showing examples of how it calculates node importance scores. The document discusses problems with the initial PageRank approach and how it was improved with the Complete PageRank algorithm. Finally, it briefly introduces Topic-sensitive PageRank, which aims to identify important nodes related to specific topics.
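The improved formulation with teleportation (what the slides call Complete PageRank) can be sketched as a short power iteration; the toy graph below is illustrative, not taken from the document:

```python
import numpy as np

# Toy directed web graph: A[i, j] = 1 if page i links to page j
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
], dtype=float)

n = len(A)
d = 0.85  # damping factor: follow a link with probability d, teleport otherwise

# Column-stochastic transition matrix (no dangling nodes in this toy graph;
# dangling pages are among the problems the complete formulation addresses)
M = (A / A.sum(axis=1)[:, None]).T

# Power iteration: r = (1 - d)/n + d * M r
r = np.full(n, 1.0 / n)
for _ in range(100):
    r = (1 - d) / n + d * M @ r

print(r)  # page 2 receives links from every other page, so it ranks highest
```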
1. Basics of Social Networks
2. Real-world problem
3. How to construct a graph from a real-world problem?
4. What graph theory problem do we get from the real-world problem?
5. Graph types of Social Networks
6. Special properties in social graph
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret graph solution back to real-world problem?
Graph theory concepts like centrality, clustering, and node-edge diagrams are used to analyze social networks. Visualization techniques include matrix representations and node-link diagrams, each with advantages. Hybrid representations combine these to leverage their strengths. MatrixExplorer allows interactive exploration of social networks using both matrix and node-link views.
This document discusses community detection in social media and online networks. It defines communities as groups of densely interconnected nodes in a graph. It outlines various algorithms for detecting communities, including graph partitioning, k-clique detection, core decomposition, divisive algorithms based on edge centrality, and modularity maximization approaches. It also discusses local community detection methods and evaluation of community detection results.
Social Network Analysis Workshop
This talk will be a workshop featuring an overview of basic theory and methods for social network analysis and an introduction to igraph. The first half of the talk will be a discussion of the concepts and the second half will feature code examples and demonstrations.
Igraph is a package in R, Python, and C++ that supports social network analysis and network data visualization.
Ian McCulloh holds joint appointments as a Parson’s Fellow in the Bloomberg School of Public Health, a Senior Lecturer in the Whiting School of Engineering, and a Senior Scientist at the Applied Physics Lab, all at Johns Hopkins University. His current research is focused on strategic influence in online networks. His most recent papers have focused on the neuroscience of persuasion and on measuring influence in online social media firestorms. He is the author of “Social Network Analysis with Applications” (Wiley: 2013) and “Networks Over Time” (Oxford: forthcoming), and has published 48 peer-reviewed papers, primarily in the area of social network analysis. His current applied work is focused on educating soldiers and marines in advanced methods for open source research and data science leadership.
More information about Dr. Ian McCulloh's work can be found at https://ep.jhu.edu/about-us/faculty-directory/1511-ian-mcculloh
The document discusses various model-based clustering techniques for handling high-dimensional data, including expectation-maximization, conceptual clustering using COBWEB, self-organizing maps, subspace clustering with CLIQUE and PROCLUS, and frequent pattern-based clustering. It provides details on the methodology and assumptions of each technique.
Social Network Analysis PowerPoint presentation (Ratnesh Shah)
Covers the basics of social network analysis and its applications, and presents interesting studies by Facebook, Twitter, YouTube, and many other social media networks; the slides include several such studies to build knowledge of online social network analysis.
A fast-paced introduction to Deep Learning concepts, such as activation functions, cost functions, back propagation, and then a quick dive into CNNs. Basic knowledge of vectors, matrices, and derivatives is helpful in order to derive the maximum benefit from this session.
Quick introduction to community detection.
Structural properties of real world networks, definition of "communities", fundamental techniques and evaluation measures.
This document provides an overview of the introductory lecture to the BS in Data Science program. It discusses key topics that were covered in the lecture, including recommended books and chapters to be covered. It provides a brief introduction to key terminologies in data science, such as different data types, scales of measurement, and basic concepts. It also discusses the current landscape of data science, including the difference between roles of data scientists in academia versus industry.
Slides for a talk about Graph Neural Network architectures; the overview is taken from the very good survey paper by Zonghan Wu et al. (https://arxiv.org/pdf/1901.00596.pdf)
Hierarchical clustering methods group data points into a hierarchy of clusters based on their distance or similarity. There are two main approaches: agglomerative, which starts with each point as a separate cluster and merges them; and divisive, which starts with all points in one cluster and splits them. AGNES and DIANA are common agglomerative and divisive algorithms. Hierarchical clustering represents the hierarchy as a dendrogram tree structure and allows exploring data at different granularities of clusters.
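A minimal agglomerative (AGNES-style) sketch using SciPy, which is an assumption here - the document names AGNES and DIANA rather than any particular library:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Two well-separated 2-D blobs
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.1, size=(5, 2)),
                 rng.normal(5, 0.1, size=(5, 2))])

# linkage builds the full merge hierarchy (the dendrogram, encoded as a matrix);
# fcluster then cuts it at a chosen granularity
Z = linkage(pts, method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

Cutting the same dendrogram at different heights is what lets hierarchical clustering expose different granularities of clusters.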
Graph mining analyzes structured data like social networks and the web through graph search algorithms. It aims to find frequent subgraphs using Apriori-based or pattern growth approaches. Social networks exhibit characteristics like densification and heavy-tailed degree distributions. Link mining analyzes heterogeneous, multi-relational social network data through tasks like link prediction and group detection, facing challenges of logical vs statistical dependencies and collective classification. Multi-relational data mining searches for patterns across multiple database tables, including multi-relational clustering that utilizes information across relations.
This document discusses unsupervised machine learning classification through clustering. It defines clustering as the process of grouping similar items together, with high intra-cluster similarity and low inter-cluster similarity. The document outlines common clustering algorithms like K-means and hierarchical clustering, and describes how K-means works by assigning points to centroids and iteratively updating centroids. It also discusses applications of clustering in domains like marketing, astronomy, genomics and more.
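The assign-then-update loop described above can be sketched in a few lines of NumPy (a toy implementation for illustration, not the document's code):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then
    recompute each centroid as the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assignment step: nearest centroid per point
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: centroid moves to the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

# Two separated blobs give two clean clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.2, size=(20, 2)),
               rng.normal(4, 0.2, size=(20, 2))])
labels, centroids = kmeans(X, k=2)
```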
Network measures used in social network analysis (Dragan Gasevic)
Definition of measures (diameter, density, degree centrality, in-degree centrality, out-degree centrality, betweenness centrality, closeness centrality) used in social network analysis. The presentation is prepared by Dragan Gasevic for DALMOOC.
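A small example of the directed measures listed above, assuming networkx as the tool (the DALMOOC slides themselves are tool-agnostic) and a toy graph:

```python
import networkx as nx

# Small directed network (toy data)
G = nx.DiGraph([(1, 2), (2, 3), (3, 1), (1, 3), (4, 1)])

density = nx.density(G)                # 5 edges out of 4*3 possible directed pairs
in_deg = nx.in_degree_centrality(G)    # incoming ties, normalized by n - 1
out_deg = nx.out_degree_centrality(G)  # outgoing ties, normalized by n - 1
btw = nx.betweenness_centrality(G)
clo = nx.closeness_centrality(G)

# Diameter requires strong connectivity in a digraph; node 4 has no incoming
# edges here, so it is computed on the underlying undirected graph instead
print(density, nx.diameter(G.to_undirected()))
```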
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013, Hong... (Anmol Bhasin)
The document summarizes a presentation on people recommender systems and social networks. It discusses key concepts in social recommenders like reciprocity and multiple objectives. It provides examples of recommender systems at LinkedIn including People You May Know, talent matching, and endorsements. It also covers special topics like intent understanding using techniques like survival analysis, and evaluation challenges for social recommenders.
This document provides an overview of social network analysis (SNA). SNA is not just a methodology but a unique perspective that focuses on relationships between individuals, groups, and institutions rather than individuals alone or macro social structures. Practical applications of SNA include improving communication in organizations, identifying criminal networks, and recommending friends in social networks. The document outlines key SNA concepts like representing networks as graphs, identifying strong/weak ties, central nodes, and measures of network structure such as cohesion, density, and clustering.
This deck briefly outlines the work we did mapping the South African Twittersphere for the 2012 SAMRA conference, including some analyses we did based on the structure of the network. Specifically, we identified people with the potential for influence based on their betweenness centrality and Authority (HITS) scores. In addition, we used a modularity algorithm to identify five clearly distinct communities within the graph. The results are for interest's sake only and should be interpreted within the limitations of the data.
GraphDice: A System for Exploring Multivariate Social Networks (Niklas Elmqvist)
This document describes GraphDice, a system for exploring multivariate social networks. GraphDice allows users to visualize social networks, with nodes representing individuals and edges representing relationships. It integrates network topology, node and edge attributes, and tabular data views. GraphDice is designed for social network analysts to consistently represent and interact with network data through features like dynamic queries, selection history, and coordinated visualizations and data tables.
This document provides an introduction to networks, including basic definitions and concepts. It defines what a network is composed of, including vertices (nodes) and edges (links). It discusses directed vs undirected networks, different ways to represent networks, degree properties, connected components, and several centrality measures - including degree, closeness, betweenness, and eigenvector centrality - to determine important nodes within a network. It provides examples and explanations for understanding these fundamental network concepts.
UNIT I- INTRODUCTION
Introduction to Web - Limitations of current Web – Development of Semantic Web – Emergence of the Social Web – Statistical Properties of Social Networks -Network analysis - Development of Social Network Analysis - Key concepts and measures in network analysis - Discussion networks -Blogs and online communities - Web-based networks
A network-based model for predicting a hashtag breakout in Twitter (Sultan Alzahrani)
Online information propagates differently on the web, and some of it can go viral. In this paper, we first introduce a simple Tweet-volume breakout definition based on standard-deviation (sigma) levels, then proceed to determine patterns of retweet network measures that predict whether a hashtag's volume will break out. We also developed a visualization tool to help trace the evolution of hashtag volumes, their underlying networks, and both local and global network measures. We trained a random forest classifier to identify effective network measures for predicting hashtag volume breakouts. Our experiments showed that "local" network features, based on a fixed-size sliding window, have an overall predictive accuracy of 76%, whereas incorporating "global" features that utilize all interactions up to the current period raises the overall predictive accuracy of a sliding-window-based breakout predictor to 83%.
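As a hedged sketch of the paper's classifier setup (all feature meanings and data below are synthetic stand-ins, not the authors' dataset), using scikit-learn's random forest:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Pretend each row is a sliding window over a hashtag's retweet network and each
# column is a network measure for that window (e.g. density, mean degree,
# clustering); the label says whether tweet volume later broke out.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 2] > 0).astype(int)  # purely synthetic breakout rule

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X[:150], y[:150])
acc = clf.score(X[150:], y[150:])

# feature_importances_ plays the role of identifying "effective network measures"
print(acc, clf.feature_importances_)
```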
This document discusses community detection in networks. It begins by emphasizing the importance of defining what constitutes a community based on the goals and data of the specific network being analyzed. It then briefly describes four common community detection techniques: hierarchical clustering, k-means clustering, spectral clustering, and modularity maximization. Hierarchical and k-means clustering partition networks based on node similarity, while spectral clustering and modularity maximization detect communities as groups of densely connected nodes.
This isn't what I thought it was: community in the network age (Nancy Wright White)
A narrated version can be found here: https://www.youtube.com/watch?v=YB82kbj-NXw This was a short remote presentation that was part of a panel at the CACUSS 12.0: Engaging Digital Citizens conference in Vancouver, BC, Canada.
The Network, the Community and the Self-Creativity (Vince Cammarata)
Lulu.com is a marketplace where “authors” - individuals, companies and groups - can publish and sell a variety of digital content including books, music, video, software, calendars, photos and artwork...
Community detection from research papers (AAN dataset) using the algorithms:
K-Means
Louvain
Newman-Girvan
github link to code: https://goo.gl/CXej44
github link to project web page: http://goo.gl/7OOkhI
youtube link to video:https://goo.gl/SCpamf
dropbox link to ppt report video: https://goo.gl/cgACzU
Detecting Community Structures in Social Networks by Graph SparsificationSatyaki Sikdar
The document discusses detecting community structures in social networks through graph sparsification. It defines communities as covers of the nodes in a graph that are either disjoint or overlapping. It describes the Girvan-Newman algorithm for detecting disjoint communities, which works by repeatedly removing the edge with the highest betweenness centrality until individual communities are revealed. The algorithm aims to preserve community structure when sparsifying large networks to enable faster community detection.
This document provides an overview of network sampling techniques, community detection algorithms, and network models. It discusses random sampling, stratified sampling, and cluster sampling for network data. For community detection, it describes algorithms based on cliques, k-cliques, k-clubs, quasi-cliques, modularity maximization, and spectral clustering. It also introduces models for small-world networks including Watts-Strogatz model and discusses how real-world networks exhibit small-world properties.
1. The document describes a social network analysis of Jose Rizal's life and works conducted by Jose A. Fadul using NodeXL software.
2. The analysis visualized Rizal's social network through nodes and edges to represent individuals and their relationships. It analyzed Rizal's network by clusters and degrees of centrality.
3. The analysis also examined how Rizal's works and other key events and individuals in his life were interconnected through the social network approach.
Community detection aims to identify groups of nodes in a network that are more densely connected internally than to the rest of the network. It can reveal properties of networks without privacy risks. While similar to clustering, community detection methods consider graph properties directly due to challenges from network data. Two recent methods are discussed - one based on shortest path betweenness to iteratively remove inter-community edges, and another based on optimizing modularity, a measure of community structure quality. Modularity can be computed using the eigenvectors of the modularity matrix.
A deck presented at the MRS 'Maximising the Value of Big Data' conference in London, January 2013.
Presents my view of big data and the potential it gives us for mapping the systems that we deal with on a day-to-day basis. Big data holds the promise of providing us with a meta-view of the systems that we all think we are so familiar with. I think we will find that the woods look nothing like the trees.
Social Network Analysis: applications for education researchChristian Bokhove
What is your talk about?
This seminar will illustrate various social network analysis (SNA) techniques and measures and their applications to research problems in education. These applications will be illustrated from our own research utilising a range of SNA techniques.
What are the key messages of your talk?
We will cover some of the ways in which network data can be collected and utilised with other research data to examine the relationships between network measures and other attributes of individuals and organisations, and how it can be linked to other approaches in multiple methods studies.
What are the implications for practice or research from your talk?
SNA is an approach that draws from theories of social capital to study the relational ties that exist between actors or institutions in a specific context. Such ties might include learning exchanges or advice-seeking interactions. SNA techniques allow researchers to incorporate the interdependence of participants within their research questions, whereas many traditional techniques assume our participants, and their responses to our questions, are independent of one another.
Clustering Methods and Community Detection with NetworkX. A slide deck for the NTU Complexity Science Winter School.
For the accompanying iPython Notebook, visit: http://github.com/eflegara/NetStruc
Social network analysis & Big Data - Telecommunications and moreWael Elrifai
Social Network Analysis: Practical Uses and Implementation is a presentation that discusses social network analysis and its uses. It covers key topics such as defining social networks and social network analysis, why social network analysis is important, identifying influencers in social networks, roles in social networks, graph theory concepts used in social network analysis, calculating metrics from social networks, and recommended approaches to social network analysis. The presentation provides an overview of social network analysis concepts and their practical applications.
Rizal created complex characters in Noli Me Tangere that represented different social statuses during his time. Crisostomo Ibarra symbolized the idealistic youth while Elias represented the common Filipino. Kapitan Tiago portrayed the rich Filipinos who oppressed others. Maria Clara depicted purity and innocence. Padre Damaso was a cruel priest who abused his power, while Padre Sibyla was a more liberal priest. Sisa and her sons Basilio and Crispin personified the suffering of the Filipino people under injustice and oppression.
- Dimensionality reduction techniques assign instances to vectors in a lower-dimensional space while approximately preserving similarity relationships. Principal component analysis (PCA) is a common linear dimensionality reduction technique.
- Kernel PCA performs PCA in a higher-dimensional feature space implicitly defined by a kernel function. This allows PCA to find nonlinear structure in data. Kernel PCA computes the principal components by finding the eigenvectors of the normalized kernel matrix.
- For a new data point, its representation in the lower-dimensional space is given by projecting it onto the principal components in feature space using the kernel trick, without explicitly computing features.
PCA is a dimensionality reduction technique that uses linear transformations to project high-dimensional data onto a lower-dimensional space while retaining as much information as possible. It works by identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. Specifically, PCA uses linear combinations of the original variables to extract the most important patterns from the data in the form of principal components. The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
This document discusses methods for determining clustering tendency in datasets. It describes generating clustered and regularly spaced data using the Neyman-Scott and simple sequential inhibition procedures. Three methods for detecting clustering tendency are outlined: tests based on structural graphs like minimum spanning trees, tests based on nearest neighbor distances like Hopkins and Cox-Lewis tests, and a sparse decomposition technique. The document provides details on how these methods work and their relative performance at detecting different patterns in datasets.
The dynamics of networks enables the function of a variety of systems we rely on every day, from gene regulation and metabolism in the cell to the distribution of electric power and communication of information. Understanding, steering and predicting the function of interacting nonlinear dynamical systems, in particular if they are externally driven out of equilibrium, relies on obtaining and evaluating suitable models, posing at least two major challenges. First, how can we extract key structural system features of networks if only time series data provide information about the dynamics of (some) units? Second, how can we characterize nonlinear responses of nonlinear multi-dimensional systems externally driven by fluctuations, and consequently, predict tipping points at which normal operational states may be lost? Here we report recent progress on nonlinear response theory extended to predict tipping points and on model-free inference of network structural features from observed dynamics.
This document summarizes a research paper that proposes a new method to accelerate the nearest neighbor search step of the k-means clustering algorithm. The k-means algorithm is computationally expensive due to calculating distances between data points and cluster centers. The proposed method uses geometric relationships between data points and centers to reject centers that are unlikely to be the nearest neighbor, without decreasing clustering accuracy. Experimental results showed the method significantly reduced the number of distance computations required.
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKSijcsit
This paper introduces an evolutionary approach to enhance the process of finding central nodes in mobile networks. This can provide essential information and important applications in mobile and social networks. This evolutionary approach considers the dynamics of the network and takes into consideration the central nodes from previous time slots. We also study the applicability of maximal cliques algorithms in mobile social networks and how it can be used to find the central nodes based on the discovered maximal cliques. The experimental results are promising and show a significant enhancement in finding the central nodes.
Multiplex Networks: structure and dynamicsEmanuele Cozzo
This document discusses the formal representation and analysis of multiplex networks. It begins by introducing complex networks science and the concept of abstracting real-world systems as graphs to study structure and interactions. It then defines multiplex networks as networks with multiple types of interactions or relations between nodes that can be represented as multiple layer-graphs. The document provides formal definitions and representations of multiplex networks using concepts like participation graphs, layer-graphs, coupling graphs, and supra-adjacency matrices. It also discusses analyzing and coarse-graining multiplex networks through measures like structural metrics, walks, and quotient graphs.
Histogram-Based Method for Effective Initialization of the K-Means Clustering...Gingles Caroline
This document proposes a histogram-based method for initializing cluster centers in k-means clustering. The method works by recursively finding the most populated histogram bin for each attribute dimension, using the bin centroid as the coordinate for that dimension. This focuses the cluster centers on dense regions of the data distribution. The method is linear in complexity, deterministic, and order-invariant, making it suitable for large datasets where other initialization methods are impractical or unreliable. Experimental results on UCI datasets show it outperforms the commonly used maximin initialization method.
This is the second lecture in the CS 6212 class. Covers asymptotic notation and data structures. Also outlines the coming lectures wherein we will study the various algorithm design techniques.
The document discusses collaborations between 6 students on the topics of data structures trees and graphs. It provides information on binary trees, binary search trees, tree and graph implementations and common graph algorithms like Dijkstra's algorithm. Examples of trees, graphs and Dijkstra's algorithm are shown.
The document discusses weighted nuclear norm minimization and its applications to image denoising. It provides background on key concepts from linear algebra and optimization theory needed to understand the denoising problem, such as convex optimization, affine transformations, singular value decomposition, and eigendecomposition. The objective of denoising is to extract the low-rank original image from a noisy high-dimensional image, modeled as the sum of the original image and white noise.
This document discusses different approaches to identifying clusters or "assemblages" in graph data. It defines assemblages as dense subgraphs with more internal than external connections. Several algorithms are described for finding assemblages, including k-medoids, Newman-Girvan, Louvain, and MCL. Evaluation metrics like modularity and weighted community clustering are also covered. The document aims to explain how to analyze real-world network data to discover meaningful assemblages.
Understanding High-dimensional Networks for Continuous Variables Using ECLHPCC Systems
Syed Rahman & Kshitij Khare, University of Florida, present at the 2016 HPCC Systems Engineering Summit Community Day.
The availability of high dimensional data (or “big data”) has touched almost every field of science and industry. Such data, where the number of variables (features) is often much higher than the number of samples, is now more pervasive than it has ever been. Discovering meaningful relationships between the variables in such data is one of the major challenges that modern day data scientists have to contend with.
The covariance matrix of the variables is the most fundamental quantity that can help us understand the complex multivariate relationships in the data. In addition to estimating the inverse covariance matrix, CSCS can be used to detect the edges in a directed acyclic graph, as opposed to the edges an undirected graph, which CONCORD (presented at the 2015 summit) was used for.
Similar to the CONCORD algorithm, the CSCS algorithm works by minimizing a convex objective function through a cyclic coordinate minimization approach. In addition, it is theoretically guaranteed to converge to a global minimum of the objective function. One of the main advantage of CSCS is that each row can be calculated independently of the other rows, and thus we are able to harness the power of distributed computing.
Syed Rahman
Syed Rahman is a PhD student in the Statistics department at the University of Florida working under the supervision of Dr. Kshitij Khare. He is interested in high-dimensional covariance estimation. In 2015, Syed programmed the CONCORD algorithm in ECL and presented this at the HPCC Systems Engineering Summit.
Kshitij Khare
Kshitij Khare is an Associate Professor of Statistics at the University of Florida. He earned his Ph.D. in Statistics from Stanford University in 2009. He has a variety of interests, which include covariance/network estimation in high-dimensional datasets, and Bayesian inference using Markov chain Monte Carlo methods. One of Dr. Khare's major research focus is development of novel statistical methods and algorithms for "big data" or high-dimensional data.
Prepared as a conference tutorial, MIC-Electrical, Athens, Greece, 5th April 2014, updated and delivered again in Beijing, China, 27 January 2015 to students from Complex Systems Group, CSRC and Dept. of Engineering Physics, Tsinghua University
The document proposes and evaluates two techniques for attention in multi-source sequence-to-sequence learning: flat attention combination and hierarchical attention combination. Both techniques achieved comparable results to existing context vector concatenation approaches on tasks of multimodal translation and automatic post-editing. Hierarchical attention combination performed best on multimodal translation and allows inspecting individual input attentions. The techniques provide a way to model importance of each input sequence.
Similar to Network centrality measures and their effectiveness (20)
Phenomics assisted breeding in crop improvementIshaGoswami9
As the population is increasing and will reach about 9 billion upto 2050. Also due to climate change, it is difficult to meet the food requirement of such a large population. Facing the challenges presented by resource shortages, climate
change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional
genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding
the complex characteristics of multiple gene, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can
be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus,
high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes
during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology,
and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
ESR spectroscopy in liquid food and beverages.pptxPRIYANKA PATEL
With increasing population, people need to rely on packaged food stuffs. Packaging of food materials requires the preservation of food. There are various methods for the treatment of food to preserve them and irradiation treatment of food is one of them. It is the most common and the most harmless method for the food preservation as it does not alter the necessary micronutrients of food materials. Although irradiated food doesn’t cause any harm to the human health but still the quality assessment of food is required to provide consumers with necessary information about the food. ESR spectroscopy is the most sophisticated way to investigate the quality of the food and the free radicals induced during the processing of the food. ESR spin trapping technique is useful for the detection of highly unstable radicals in the food. The antioxidant capability of liquid food and beverages in mainly performed by spin trapping technique.
hematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills
Current Ms word generated power point presentation covers major details about the micronuclei test. It's significance and assays to conduct it. It is used to detect the micronuclei formation inside the cells of nearly every multicellular organism. It's formation takes place during chromosomal sepration at metaphase.
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxMAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poorquality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
The binding of cosmological structures by massless topological defectsSérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field
equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational
field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin
spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling
concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect
light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is
mitigated, at least in part.
The debris of the ‘last major merger’ is dynamically youngSérgio Sacani
The Milky Way’s (MW) inner stellar halo contains an [Fe/H]-rich component with highly eccentric orbits, often referred to as the
‘last major merger.’ Hypotheses for the origin of this component include Gaia-Sausage/Enceladus (GSE), where the progenitor
collided with the MW proto-disc 8–11 Gyr ago, and the Virgo Radial Merger (VRM), where the progenitor collided with the
MW disc within the last 3 Gyr. These two scenarios make different predictions about observable structure in local phase space,
because the morphology of debris depends on how long it has had to phase mix. The recently identified phase-space folds in Gaia
DR3 have positive caustic velocities, making them fundamentally different than the phase-mixed chevrons found in simulations
at late times. Roughly 20 per cent of the stars in the prograde local stellar halo are associated with the observed caustics. Based
on a simple phase-mixing model, the observed number of caustics are consistent with a merger that occurred 1–2 Gyr ago.
We also compare the observed phase-space distribution to FIRE-2 Latte simulations of GSE-like mergers, using a quantitative
measurement of phase mixing (2D causticality). The observed local phase-space distribution best matches the simulated data
1–2 Gyr after collision, and certainly not later than 3 Gyr. This is further evidence that the progenitor of the ‘last major merger’
did not collide with the MW proto-disc at early times, as is thought for the GSE, but instead collided with the MW disc within
the last few Gyr, consistent with the body of work surrounding the VRM.
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
Network centrality measures and their effectiveness
1. centrality measures
Survey and comparisons
Authors: Antonio Esposito
Emanuele Pesce
Supervisors: Prof. Vincenzo Auletta
Ph.D. Diodato Ferraioli
April 2015
University of Salerno, Department of Computer Science
4. centrality of a network
What is a centrality measure?
∙ Given a network, centrality is a quantitative measure that aims at revealing the importance of a node
∙ The more central a node is, the more important it is
∙ Formally, a centrality measure is a real-valued function on the nodes of a graph
What do you mean by center?
∙ There are many intuitive ideas about what a center is, so there are many different centrality measures
5. definition of center
The center of a star is at the same time:
∙ the node with the largest degree
∙ the node that is closest to the other nodes
∙ the node through which most shortest paths pass
∙ the node with the largest number of incoming paths
∙ the node that maximizes the dominant eigenvector of the graph matrix
Several centrality indices
∙ Different centrality indices capture different properties of a network
6. centrality: some applications
Centrality is often used to determine:
∙ how influential a person is in a social network
∙ how heavily used a road is in a transportation network
∙ how important a web page is
∙ how important a room is in a building
10. geometric measures
The idea
∙ In geometric measures, importance is a function of distances
∙ A geometric centrality depends on how many nodes exist at every distance
11. geometric measures: indegree centrality
∙ Indegree centrality is defined as the number of incoming arcs of a node x:
  C_indegree(x) = d^-(x)   (1)
∙ The node with the highest indegree is the most important
When to use it?
∙ To identify people you can talk to
∙ To identify people who will do favors for you
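This count can be sketched in a few lines of Python (the toy graph and node names are illustrative, not from the slides):

```python
# In-degree centrality: count the incoming arcs of each node.
# The graph maps each node to the list of nodes it points to.
graph = {
    "a": ["c"],
    "b": ["c"],
    "c": ["a"],
}

def indegree_centrality(graph):
    # Every node starts at 0; add 1 for each arc pointing at it.
    score = {node: 0 for node in graph}
    for source, targets in graph.items():
        for target in targets:
            score[target] += 1
    return score

print(indegree_centrality(graph))  # c has two incoming arcs, a has one, b has none
```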
13. indegree centrality: examples
Indegree centrality can be deceiving because it is a local measure
Indegree centrality does not work well for:
∙ detecting nodes that act as brokers between two groups
∙ predicting whether a piece of information will reach a node
14. geometric measures: closeness centrality
∙ Closeness centrality of x is defined by:
  C_closeness(x) = 1 / Σ_{d(y,x)<∞} d(y, x)   (2)
∙ To normalize closeness centrality, divide the sum of distances by the maximum number of other nodes (n − 1)
∙ Nodes with an empty coreachable set have centrality 0
∙ The closer a node is to all others, the more important it is
When to use it?
∙ To identify people who tend to be very influential within their local network
∙ They may often not be public figures, but they are often respected locally
∙ To measure how long it will take to spread information from node x to all other nodes
16. geometric measures: harmonic centrality
∙ Harmonic centrality of x, with the convention ∞^{-1} = 0, is defined by:
  C_harmonic(x) = Σ_{y≠x} 1 / d(y, x)   (3)
∙ It is correlated with closeness centrality in simple networks, but it also accounts for nodes y that cannot reach x
When to use it?
∙ The same cases as closeness, but it can also be applied to graphs that are not connected
18. lin's index
∙ Lin's index of x:
  C_lin(x) = |{y | d(y, x) < ∞}|^2 / Σ_{d(y,x)<∞} d(y, x)   (4)
∙ As closeness, but here nodes with a larger coreachable set are more important
A fact
∙ Surprisingly, Lin's index has been largely ignored in the literature, even though it seems to provide a reasonable solution for detecting centers in networks
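Formulas (2)–(4) can be sketched together: on an unweighted digraph, a BFS over reversed edges yields every d(y, x). The toy graph is illustrative, and giving nodes with an empty coreachable set a Lin score of 1 follows Boldi and Vigna's convention, which the slide does not state.

```python
from collections import deque

# graph maps each node to the nodes it points to (unweighted, directed)
graph = {"a": ["b"], "b": ["c"], "c": [], "d": ["c"]}

def distances_to(graph, x):
    """BFS over reversed edges: d(y, x) for every y != x that can reach x."""
    reverse = {n: [] for n in graph}
    for s, targets in graph.items():
        for t in targets:
            reverse[t].append(s)
    dist, queue = {x: 0}, deque([x])
    while queue:
        u = queue.popleft()
        for v in reverse[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    del dist[x]          # keep only y != x
    return dist          # the coreachable set of x, with distances

def closeness(graph, x):
    d = distances_to(graph, x)
    return 1 / sum(d.values()) if d else 0        # empty coreachable set -> 0

def harmonic(graph, x):
    # unreachable nodes contribute 1/inf = 0, so they are simply absent here
    return sum(1 / v for v in distances_to(graph, x).values())

def lin(graph, x):
    d = distances_to(graph, x)
    if not d:
        return 1                                   # Boldi-Vigna convention
    return (len(d) + 1) ** 2 / sum(d.values())     # +1 counts x itself (d = 0)
```

On this graph c is reachable from b and d at distance 1 and from a at distance 2, so closeness(c) = 1/4, harmonic(c) = 1 + 1 + 1/2 = 2.5 and lin(c) = 4² / 4 = 4.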
19. path-based measures
The idea
∙ Path-based measures exploit not only the existence of shortest paths, but actually examine all shortest paths (or all paths) coming into a node
20. path-based measures: betweenness centrality
∙ The intuition behind betweenness centrality is to measure the probability that a random shortest path passes through a given node. Betweenness of x is defined as:
  C_betweenness(x) = Σ_{y,z≠x, α_yz≠0} α_yz(x) / α_yz   (5)
∙ α_yz is the number of shortest paths going from y to z
∙ α_yz(x) is the number of those shortest paths that pass through x
∙ The higher the fraction of shortest paths passing through a node, the more important the node is
When to use it?
∙ To identify nodes that have a large influence on the transfer of items through the network
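A brute-force sketch of formula (5) for unweighted graphs (fine for toy examples; real libraries use Brandes' algorithm instead). The path graph below is illustrative:

```python
from collections import deque
from itertools import permutations

# undirected path a - b - c - d, stored as a symmetric digraph
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

def bfs_counts(graph, s):
    """Distance and number of shortest paths from s to every reachable node."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v], sigma[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]    # every shortest path to u extends to v
    return dist, sigma

def betweenness(graph, x):
    dist, sigma = {}, {}
    for s in graph:
        dist[s], sigma[s] = bfs_counts(graph, s)
    total = 0.0
    for y, z in permutations(graph, 2):          # ordered pairs, as in (5)
        if x in (y, z) or z not in dist[y]:      # skip pairs with alpha_yz = 0
            continue
        through = 0
        if (x in dist[y] and z in dist[x]
                and dist[y][x] + dist[x][z] == dist[y][z]):
            through = sigma[y][x] * sigma[x][z]  # shortest y-z paths via x
        total += through / sigma[y][z]
    return total

print(betweenness(graph, "b"))  # (a,c), (a,d), (c,a), (d,a) all cross b -> 4.0
```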
23. betweenness and closeness
∙ Betweenness and closeness measures applied to the same network
∙ The nodes are sized by degree and colored by betweenness
24. spectral measures
The idea
∙ In spectral measures, importance is related to the iterated computation of the left dominant eigenvector of the adjacency matrix
∙ In spectral centrality, the importance of a node is given by the importance of its neighbourhood
∙ The more important the nodes pointing at you are, the more important you are
25. spectral measures
How many of them?
∙ The dominant eigenvector
∙ Seeley's index
∙ Katz's index
∙ PageRank
∙ HITS
∙ SALSA
26. spectral measures: some useful notation
Given the adjacency matrix A we can compute:
∙ The ℓ1-normalized matrix ¯A: each element of row i is divided by the sum of the elements of that row
∙ The symmetric graph G′ of the given graph G
∙ The transpose A^T of the adjacency matrix A
∙ The number of paths of length k from a node i to another node j: in the matrix A^k, each element a_ij is the number of paths of length k from node i to node j
27. spectral measures: the left dominant eigenvector
Dominant eigenvector
∙ Taking into consideration the left dominant eigenvector means considering the incoming edges of a node
∙ To find out each node's importance, we perform an iterated computation of:
  x_i^{t+1} = (1/λ) Σ_j A_ji x_j^t   (6)
where:
∙ x_i^0 = 1 for all i at step 0
∙ x^t is the score vector after t iterations
∙ λ is the dominant eigenvalue of the adjacency matrix A
∙ After each step, the vector x is normalized and the process is iterated until convergence
∙ Each node starts with the same score; then, at each iteration, it receives the sum of its incoming neighbours' scores
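A power-iteration sketch of this computation; normalizing at every step plays the role of the 1/λ factor, since the scores converge to the dominant eigenvector either way. The toy graph is illustrative:

```python
# Left-dominant-eigenvector (eigenvector centrality) by power iteration.
# Undirected toy graph: triangle a-b-c with a pendant node d attached to c.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
A = {u: {v: 0 for v in nodes} for u in nodes}
for u, v in edges:
    A[u][v] = A[v][u] = 1          # undirected: arcs in both directions

def eigenvector_centrality(nodes, A, iterations=100):
    x = {n: 1.0 for n in nodes}    # every node starts with the same score
    for _ in range(iterations):
        # each node receives the sum of its incoming neighbours' scores
        new = {i: sum(A[j][i] * x[j] for j in nodes) for i in nodes}
        norm = sum(v * v for v in new.values()) ** 0.5
        x = {n: v / norm for n, v in new.items()}   # normalize each step
    return x

scores = eigenvector_centrality(nodes, A)
# c is in the triangle *and* serves the pendant, so it scores highest;
# d only touches c, so it scores lowest
```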
28. eigenvector centrality: example
In figure 1, degree and eigenvector centrality are applied to the same graph
Figure 1: Degree and eigenvector centrality
29. spectral measures: seeley's index
∙ Why give away all of our importance?
∙ It would make more sense to divide our importance equally among our successors
∙ The process remains the same, but from an algebraic point of view this means normalizing each row of the adjacency matrix:
  x_i^{t+1} = (1/λ) Σ_j ¯A_ji x_j^t   (7)
where:
∙ x_i^0 = 1 for all i at step 0
∙ x^t is the score vector after t iterations
∙ λ is the dominant eigenvalue of the normalized matrix ¯A
∙ ¯A is the row-normalized form of the adjacency matrix
∙ Isolated nodes of a non-strongly-connected graph will have a null score over the iterations
30. spectral measures: katz's index
Katz's index weighs all incoming paths to a node and then computes:
  x = 1 Σ_{i=0}^{∞} β^i A^i   (8)
where:
∙ x is the output score vector
∙ 1 is the weight vector (for example, all ones)
∙ β is an attenuation factor (β < 1/λ)
∙ A^i contains in its generic element the number of paths of length i between the corresponding pair of nodes
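Formula (8) can be approximated by truncating the series; on a directed acyclic graph the series is even finite. A sketch on an illustrative two-arc graph a → b → c, with β = 0.5:

```python
# Truncated-series sketch of Katz's index: x = 1 * sum_i beta^i A^i.
nodes = ["a", "b", "c"]
A = {"a": {"b": 1}, "b": {"c": 1}, "c": {}}    # arcs a -> b -> c

def katz(nodes, A, beta=0.5, terms=50):
    x = {n: 1.0 for n in nodes}          # i = 0 term: the all-ones weight vector
    term = {n: 1.0 for n in nodes}       # current row-vector term 1 * (beta A)^i
    for _ in range(terms):
        nxt = {n: 0.0 for n in nodes}    # multiply the term by beta*A on the right
        for u in nodes:
            for v, w in A[u].items():
                nxt[v] += beta * w * term[u]
        term = nxt
        for n in nodes:
            x[n] += term[n]
    return x

scores = katz(nodes, A)
# a has no incoming paths (score 1); b gets one 1-step path (1 + 0.5);
# c gets a 1-step and a 2-step path (1 + 0.5 + 0.25)
```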
31. spectral measures: pagerank
PageRank - a little overview
∙ It’s supposed to be how the Google’s search engine works
∙ It is the unique vector p satisfying
p = (1 − α)v(1 − α¯A)−1
∙ where:
∙ α ∈ [0, 1) is a dumping factor
∙ v is a preference vector (a distribution)
∙ ¯A is the ℓ1 normalized adjacency matrix
∙ As shown, PageRank and Katz’s index differ by a constant factor
and the ℓ1 normalization of the adjacency matrix A
32. spectral measures: eigenvector and pagerank
Figure 2 shows eigenvector and PageRank centrality applied to the same graph.
Figure 2: Eigenvector and PageRank centrality
33. spectral measures: hits
HITS - a little overview, by Kleinberg
∙ The key idea here is mutual reinforcement
∙ A node (such as a web page) is authoritative if it is pointed to by many good hubs
∙ Hubs: pages containing good lists of authoritative pages
∙ In turn, a hub is good if it points to many authoritative pages
∙ We iteratively compute:
∙ a_i: the authoritativeness score (with a_0 = 1)
∙ h_i: the hubbiness score
as follows:
h_{i+1} = a_i A^T
a_{i+1} = h_{i+1} A
∙ This process converges to the left dominant eigenvector of the matrix A^T A, which gives the final authoritativeness score, called ”HITS”
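The mutual-reinforcement loop can be sketched as follows (made-up graph; normalizing both vectors each round keeps the scores bounded):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],      # hypothetical graph: A[i, j] = 1 means i links to j
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

a = np.ones(A.shape[0])          # a_0 = 1
for _ in range(100):
    h = a @ A.T                  # a hub collects the authority of the pages it points to
    a = h @ A                    # an authority collects the hubbiness of pages pointing to it
    h = h / np.linalg.norm(h)    # keep both score vectors bounded
    a = a / np.linalg.norm(a)

print(np.round(a, 3), np.round(h, 3))
```

Substituting one update into the other gives a = a (A^T A) up to scaling, which is why the authority vector converges to the dominant eigenvector of A^T A.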
34. spectral measures: salsa
SALSA was introduced by Lempel and Moran
∙ It is based on the same mutual reinforcement between authoritativeness and hubbiness, but it ℓ1-normalizes the matrices A and A^T.
∙ Starting value: a_0 = 1
∙ h_{i+1} = a_i \bar{A^T}
∙ a_{i+1} = h_{i+1} \bar{A}
∙ Unlike HITS, SALSA does not need a large number of iterations
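One plausible reading of the SALSA update in NumPy (made-up graph; `l1_rows` is a helper name of my own, and I assume the authority step feeds on the freshly computed hub vector, mirroring HITS):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],      # hypothetical graph
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def l1_rows(M):
    """l1-normalize each row, leaving all-zero rows untouched."""
    s = M.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0
    return M / s

A_bar = l1_rows(A)               # l1-normalized A
At_bar = l1_rows(A.T)            # l1-normalized A^T

a = np.ones(A.shape[0])          # a_0 = 1
for _ in range(100):
    h = a @ At_bar               # hub update on the normalized matrix
    a = h @ A_bar                # authority update

print(np.round(a, 3))
```

Because both normalized matrices are row-stochastic, the total score is conserved at every step and the iteration settles quickly, which matches the remark that SALSA needs far fewer iterations than HITS.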
35. spectral measures: some applications
∙ Left dominant eigenvector: the idea on which network structure analysis is based
∙ Seeley’s index: feedback networks
∙ Katz’s index: citation networks
∙ especially good with directed acyclic graphs (where the plain dominant eigenvector does not perform well)
∙ HITS: web page citations
∙ PageRank: Google’s search engine
∙ SALSA: link structure analysis
37. axioms for centrality
∙ In 2013, Boldi and Vigna proposed a method to evaluate and compare different centrality measures
∙ They defined three axioms that an index should satisfy in order to behave predictably:
∙ the size axiom
∙ the density axiom
∙ the score-monotonicity axiom
38. axioms for centrality: size axiom
Given a graph S_{k,p} (figure 3), made of a k-clique and a directed p-cycle, the size axiom is satisfied if there are threshold values of p and k such that:
∙ if p is large enough with respect to k (the cycle is very large), the nodes of the cycle are more important
∙ if k is large enough with respect to p, the nodes of the clique are more important
∙ intuitively, for p = k, the nodes of the clique are more important
Figure 3: Graph S_{k,p}
39. axioms for centrality: density axiom
∙ Given a graph Dk,p(figure 4), made by a k − clique and a directed
p − cycle connected by a bidirectional bridge x ↔ y, where x is a
node of the clique and y a node of the cycle.
∙ A centrality measure satisfies the density axiom for k = p, if the
centrality of x is strictly larger than the centrality of y.
Figure 4: Graph Gk,p
40. axioms for centrality: the score-monotonicity axiom
∙ A centrality measure satisfies the score-monotonicity axiom if, for every graph G and every pair of nodes x, y such that x ↛ y, adding the edge x → y to G strictly increases the centrality of y.
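As a concrete check, the sketch below (plain Python, on a made-up 5-node graph) computes harmonic centrality, which satisfies this axiom: adding the previously absent edge 4 → 3 strictly increases the score of node 3.

```python
from collections import deque

# Hypothetical directed graph as an adjacency dict: u -> list of successors
G = {0: [1], 1: [2], 2: [3], 3: [], 4: [0]}

def harmonic(G, y):
    """Harmonic centrality of y: sum over x != y of 1 / d(x, y)."""
    R = {u: [] for u in G}            # reverse graph, so a single BFS
    for u, succs in G.items():        # from y finds distances *towards* y
        for v in succs:
            R[v].append(u)
    dist = {y: 0}
    queue = deque([y])
    while queue:
        u = queue.popleft()
        for w in R[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return sum(1.0 / d for node, d in dist.items() if node != y)

before = harmonic(G, 3)               # 1 + 1/2 + 1/3 + 1/4
G[4].append(3)                        # add the edge 4 -> 3 (x -/-> y beforehand)
after = harmonic(G, 3)
print(before, after)                  # the centrality of node 3 strictly increases
```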
41. axioms for centrality: comparisons
Figure 5: For each centrality measure and each axiom, whether the axiom is satisfied
Harmonic centrality satisfies all the axioms.
42. information retrieval: sanity check
∙ Boldi and Vigna have applied centrality measures on standard
datasets in order to find out the behavior of different indices
∙ There are standard datasets with associated queries and ground
truth about which documents are relevant for every query
∙ Those collections are typically used to compare the merits and the
demerits about retrieval methods
43. information retrieval: datasets
The GOV2 dataset, tested in two different ways:
∙ with all links: the complete dataset
∙ with inter-host links only: links between pages of the same host are excluded from the graph
Measures of effectiveness chosen:
∙ P@10: precision at 10, the fraction of relevant documents among the first ten retrieved
∙ NDCG@10: normalized discounted cumulative gain at 10, which measures the usefulness, or gain, of a document based on its position in the result list
44. information retrieval: results
For each centrality measure the discounted cumulative and precision at 10, on GOV2
dataset using all links (on the left) and using only inter-host links (on the right).
Figure 6: All links Figure 7: Inter-host links 43
46. conclusions
∙ A very simple measure such as harmonic centrality turned out to be a good notion of centrality:
∙ it satisfies all the proposed centrality axioms
∙ it works well for information retrieval
Choose the right measure
∙ No centrality measure is better than the others in every situation
∙ Some are better than others at reaching a particular goal, depending on the specific application domain
∙ So the best approach is to understand which measure fits the problem best
47. references and useful resources
Paolo Boldi and Sebastiano Vigna.
Axioms for centrality.
Nicola Perra and Santo Fortunato.
Spectral centrality measures in complex networks.
M. E. J. Newman.
Networks: An Introduction.