1. The document discusses a proposed technique called Fuzzy Based Improved Mutual Friend Crawling (Fmfc) for crawling online social networks. It aims to reduce bias introduced by the time taken for crawling the whole network.
2. The technique crawls all users within the same community first before moving to the next community, allowing researchers to selectively obtain users belonging to the same community. This is compared to existing mutual friend crawling.
3. The paper also provides a literature review of existing crawling techniques and studies of complex network properties relevant to community detection in networks. Future work in overlapping communities and performance evaluation on very large networks is discussed.
Scalable Local Community Detection with MapReduce for Large Networks IJDKP
Community detection in complex information networks draws much attention from both academia and
industry because it has many real-world applications. However, the scalability of community detection algorithms
over very large networks has been a major challenge. Real-world graph structures are often complicated
and extremely large. In this paper, we propose a MapReduce version called 3MA that
parallelizes a local community identification method which uses the $M$ metric. We then adopt an
iterative expansion approach to find all the communities in the graph. Empirical results show that for large
networks on the order of millions of nodes, the parallel version of the algorithm outperforms the traditional
sequential approach to detecting communities using the M-measure. The results show that for local community
detection, when the data is too big for the original M metric-based sequential iterative expansion approach
to handle, our MapReduce version 3MA can finish in a reasonable time.
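To make the idea of local expansion concrete, here is a minimal sequential sketch in Python. It assumes that the M metric of a node set is the ratio of edges inside the set to edges leaving it; the abstract does not define the metric, so this definition, the function names `m_metric` and `expand_from_seed`, and the greedy acceptance rule are all illustrative assumptions. It is not the 3MA MapReduce algorithm itself.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def m_metric(graph: Graph, community: Set[str]) -> float:
    """Assumed definition: internal edge count divided by external edge count."""
    internal_ends = 0  # each internal edge is seen from both endpoints
    external = 0
    for u in community:
        for v in graph[u]:
            if v in community:
                internal_ends += 1
            else:
                external += 1
    internal = internal_ends // 2
    return float("inf") if external == 0 else internal / external


def expand_from_seed(graph: Graph, seed: str) -> Set[str]:
    """Greedily add the neighbouring node that most increases the metric."""
    community = {seed}
    while True:
        frontier = {v for u in community for v in graph[u]} - community
        best_node, best_score = None, m_metric(graph, community)
        for v in sorted(frontier):  # sorted only to make ties deterministic
            score = m_metric(graph, community | {v})
            if score > best_score:
                best_node, best_score = v, score
        if best_node is None:
            return community  # no neighbour improves the metric
        community.add(best_node)
```

On a graph made of two triangles joined by a single edge, starting from a node of one triangle, the expansion stops once that triangle is covered, because adding the bridge endpoint would lower the ratio.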
COMMUNITY DETECTION USING INTER CONTACT TIME AND SOCIAL CHARACTERISTICS BASED...ijasuc
In Delay Tolerant Networks (DTNs), node connectivity is opportunistic and an end-to-end path between
any source and destination pair is not guaranteed most of the time. Hence messages are transferred
from source to destination via intermediate nodes on a hop-by-hop basis using the store-carry-forward paradigm.
With the rapid advancement of handheld devices such as smartphones and laptops that have wireless
communication interfaces and are carried by people, it may soon be possible to use DTNs for message
dissemination without setting up infrastructure. Routing is challenging in DTNs because of
intermittent network connectivity: a connection opportunity arises only when nodes come within
transmission range of each other. The performance of a routing protocol depends on the selection of an
appropriate relay node that can deliver the message to the final destination when the source and destination
do not meet at all. Humans exhibit many social characteristics, such as friendship,
community, similarity and centrality, which a routing protocol can exploit when making
forwarding decisions. The literature shows that using these characteristics improves the performance of DTN routing
protocols in terms of delivery probability. Existing routing schemes detect
communities using aggregated contact duration and contact frequency, which do not change over
time. We propose community detection through the Inter Contact Time (ICT) between node pairs
using a power law distribution, where community members are added and removed dynamically. We
also keep a single copy of each message in the entire network to reduce network overhead. The
proposed routing protocol, named Social Based Single Copy Routing (SBSCR), selects a suitable relay
node only from the community members, based on the social metrics of similarity and friendship
together. ICTs show a power law nature in human mobility, which is used to detect the community structure at
each node. Each node maintains its own community and social metrics, such as similarity and friendship, with
other nodes. Whenever a node has to select a relay node, it selects the one from its community with the higher
value of the social metric. The simulations are conducted using the ONE simulator on real traces from campus
and conference environments. SBSCR is compared with existing schemes, and the results show that it
outperforms them in terms of delivery probability and delivery delay with a comparable overhead ratio.
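The notion of inter contact time can be illustrated with a short Python sketch. It assumes each contact between two nodes is recorded as a `(start, end)` pair and treats the gap between consecutive contacts as one ICT sample. The membership rule below, a simple threshold on the mean ICT called `max_mean_ict`, is a hypothetical stand-in: the abstract says the decision is based on a power law distribution but does not give the rule, so this is not the SBSCR procedure.

```python
from typing import List, Set, Tuple

Contact = Tuple[float, float]  # (start time, end time) of one contact


def inter_contact_times(contacts: List[Contact]) -> List[float]:
    """Gaps between the end of one contact and the start of the next."""
    ordered = sorted(contacts)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end:  # ignore overlapping records
            gaps.append(next_start - prev_end)
    return gaps


def update_community(community: Set[str], peer: str,
                     contacts: List[Contact], max_mean_ict: float) -> Set[str]:
    """Hypothetical rule: keep a peer while its mean ICT stays small."""
    gaps = inter_contact_times(contacts)
    if gaps and sum(gaps) / len(gaps) <= max_mean_ict:
        community.add(peer)      # peer is met often enough
    else:
        community.discard(peer)  # peer dropped dynamically
    return community
```

Re-running `update_community` as new contacts arrive gives the dynamic add-and-remove behaviour the abstract describes, under the stated threshold assumption.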
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Community Detection in Networks Using Page Rank Vectors ijbbjournal
Nodes in real-world networks organize into network communities. A community (also referred
to as a module or cluster) is defined as a group of nodes with denser links among its members and sparser links to the
rest of the network. Communities in networks also overlap because nodes may belong to several
clusters at once. Detecting communities in networks remains an open problem because of a lack
of reliable algorithms. In practice, all the existing community detection methods work well for non-overlapping
communities and fail to detect communities with dense overlaps. We developed a novel method
for detecting communities by considering a single seed node. This method successfully captures
overlapping communities in networks ranging from social to information and from biological to citation networks. We
believe that the proposed system works well for overlapping communities.
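One common way to grow a community from a single seed with PageRank vectors is a personalized PageRank followed by a sweep cut. The Python sketch below shows that general technique under stated assumptions: the restart probability `alpha`, the fixed number of power iterations, and the use of conductance as the cut quality are choices made for illustration. The abstract does not describe its method in this detail, so this should not be read as the paper's algorithm.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def personalized_pagerank(graph: Graph, seed: str,
                          alpha: float = 0.15, iters: int = 100) -> Dict[str, float]:
    """Power iteration for a random walk that restarts at the seed."""
    rank = {v: 0.0 for v in graph}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in graph}
        nxt[seed] += alpha  # restart mass
        for u, nbrs in graph.items():
            if not nbrs:  # isolated node: send its mass back to the seed
                nxt[seed] += (1.0 - alpha) * rank[u]
                continue
            share = (1.0 - alpha) * rank[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        rank = nxt
    return rank


def conductance(graph: Graph, group: Set[str]) -> float:
    """Cut edges divided by the smaller of the two side volumes."""
    cut = sum(1 for u in group for v in graph[u] if v not in group)
    vol_in = sum(len(graph[u]) for u in group)
    vol_out = sum(len(nbrs) for nbrs in graph.values()) - vol_in
    smaller = min(vol_in, vol_out)
    return 1.0 if smaller == 0 else cut / smaller


def sweep_cut(graph: Graph, seed: str, alpha: float = 0.15) -> Set[str]:
    """Return the prefix of the degree-normalized ranking with lowest conductance."""
    rank = personalized_pagerank(graph, seed, alpha)
    order = sorted((v for v in graph if graph[v]),
                   key=lambda v: rank[v] / len(graph[v]), reverse=True)
    best, best_phi = set(), float("inf")
    prefix: Set[str] = set()
    for v in order[:-1]:  # the full node set is not a useful community
        prefix.add(v)
        phi = conductance(graph, prefix)
        if phi < best_phi:
            best, best_phi = set(prefix), phi
    return best
```

Running the sweep from several different seeds can return sets that share nodes, which is one way such seed-based methods accommodate overlapping communities.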
1. Basics of Social Networks
2. Real-world problem
3. How to construct graph from real-world problem?
4. What graph theory problem do we get from the real-world problem?
5. Graph type of Social Networks
6. Special properties in social graph
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret graph solution back to real-world problem?
DOMINANT FEATURES IDENTIFICATION FOR COVERT NODES IN 9/11 ATTACK USING THEIR ...IJNSA Journal
In recent times, terrorism poses a threat to homeland security. A major problem in network analysis is automatically identifying the key player who can maximally influence other nodes in a large relational covert network. Existing centrality-based and graph-theoretic approaches are more concerned with the network structure than with node attributes. In this paper, an unsupervised framework, SoNMine, is developed to identify the key players in the 9/11 network using their behavioral profiles. The behaviors of nodes are analyzed based on the generated behavioral profile. The key players are identified by outlier analysis of the profile, and the most highly communicating node is concluded to be the most influential person in the covert network. Further, to improve the classification of normal and outlier nodes, an intermediate reference class R is generated. Based on these three classes, the most dominant feature set is determined, which further helps to accurately justify the outlier nodes.
Community detection from research papers (AAN dataset) using the algorithms:
K-Means
Louvain
Newman-Girvan
github link to code: https://goo.gl/CXej44
github link to project web page: http://goo.gl/7OOkhI
youtube link to video: https://goo.gl/SCpamf
dropbox link to ppt report video: https://goo.gl/cgACzU
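Louvain works by optimizing modularity, and modularity is also a common way to pick the best split produced by Newman-Girvan edge removal, so a small modularity function is a useful way to compare the partitions these algorithms return. The Python sketch below is an illustration only and is not taken from the linked project; the function name `modularity` and the adjacency-set graph format are assumptions.

```python
from typing import Dict, Iterable, Set

Graph = Dict[str, Set[str]]  # undirected, unweighted graph as adjacency sets


def modularity(graph: Graph, communities: Iterable[Set[str]]) -> float:
    """Q = sum over communities of (L_c / m) - (d_c / 2m)^2.

    L_c is the number of edges inside community c, d_c is the sum of
    the degrees of its nodes, and m is the total number of edges.
    """
    two_m = sum(len(nbrs) for nbrs in graph.values())
    q = 0.0
    for group in communities:
        # Every internal edge is counted from both endpoints, giving 2 * L_c.
        internal_ends = sum(1 for u in group for v in graph[u] if v in group)
        degree_sum = sum(len(graph[u]) for u in group)
        q += internal_ends / two_m - (degree_sum / two_m) ** 2
    return q
```

A partition that matches the dense groups of the graph scores higher than one that lumps everything together, which always scores zero.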
An Improved PageRank Algorithm for Multilayer Networks Subhajit Sahu
Highlighted notes on An Improved PageRank Algorithm for Multilayer Networks, made while doing research work under Prof. Kishore Kothapalli.
The authors use a layer-layer weight to provide an additional factor to PageRank (PR), which they then use in the first step of the SpreadMax algorithm to show that their modified PageRank is better than standard PR at selecting key nodes. I didn't properly understand this.
Applications of networks:
Social networks, Air traffic systems, Transportation networks, Citation networks, Biological systems (infection spreading).
Multilayer networks can be thought of as a large graph subdivided into multiple groups (layers). This probably makes it easier to assign common weights to a particular group of nodes and edges.
Community detection in complex networks has attracted growing interest and is the subject of several
research efforts that aim to understand network structure and analyze network
properties. In this paper, we give a thorough overview of different community discovery strategies,
propose a taxonomy of these methods, and specify the differences between the suggested classes, which
helps designers compare and choose the most suitable strategy for the various types of network
encountered in the real world.
IOSR Journal of Pharmacy and Biological Sciences(IOSR-JPBS) is an open access international journal that provides rapid publication (within a month) of articles in all areas of Pharmacy and Biological Science. The journal welcomes publications of high quality papers on theoretical developments and practical applications in Pharmacy and Biological Science. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
The ability to “think outside the box” is generally regarded as something positive. At a time when resources are scarce and the problems facing us are many, innovation and professional excellence become a requirement rather than a matter of choice. At the core of our attempts to come up with new and better solutions are digital technologies. Within the structural engineering context, the different types of off-the-shelf packages for finite element analysis play a central role. These “black-box” software packages exemplify how user-friendliness may have harmful consequences in a field where knowledge and the successful mastery of relevant skills are key and where, consequently, ignorance may lead to fatal results. These tools make any effort at “venturing outside” difficult to achieve. A technical paradigm shift is called for, one that places learning and creative, informed exploration at the heart of the user experience. Presented during the Knowledge Based Engineering session of the 19th IABSE congress entitled "Challenges in Design and Construction of an Innovative and Sustainable Built Environment" held in Stockholm, September 21-23, 2016.
MODELING SOCIAL GAUSS-MARKOV MOBILITY FOR OPPORTUNISTIC NETWORK csandit
Mobility is attracting more and more interest due to its importance for data forwarding
mechanisms in many networks, such as mobile opportunistic networks. In everyday life, mobile
nodes are often carried by humans. Thus, the mobility pattern of mobile nodes is inevitably affected by
human social character. This paper presents a novel mobility model (HNGM) which combines
social character with the Gauss-Markov process. A performance analysis of this
mobility model is given, and one famous and widely used mobility model (RWP) is chosen for
comparison.
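For readers unfamiliar with the Gauss-Markov process used in mobility modeling, the following Python sketch shows the standard update for one quantity such as speed or direction: the new value mixes the previous value, a long-run mean, and Gaussian noise, with a memory parameter `alpha` between 0 and 1. It covers only the plain Gauss-Markov part. The social component of HNGM is not described in the abstract and is not modeled here, and the function name and parameters are illustrative assumptions.

```python
import math
import random
from typing import List


def gauss_markov_series(steps: int, alpha: float, mean: float,
                        sigma: float, seed: int = 0) -> List[float]:
    """Generate a Gauss-Markov sequence starting at the mean value.

    alpha = 1 keeps the value constant (full memory); alpha = 0 makes
    every sample an independent draw around the mean (no memory).
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    value = mean
    series = [value]
    for _ in range(steps):
        noise = rng.gauss(0.0, sigma)
        value = (alpha * value
                 + (1.0 - alpha) * mean
                 + math.sqrt(1.0 - alpha ** 2) * noise)
        series.append(value)
    return series
```

Calling the function once for speed and once for direction, then integrating the resulting velocity over time, gives a node trajectory whose smoothness is controlled by `alpha`.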
Finding prominent features in communities in social networks using ontology csandit
Community detection is one of the major tasks in social networks. The success of any community
depends upon the features that were selected to form the community. So it is important to
know the main features that may affect the community. In this work we
propose a method to find the prominent features on the basis of which a community can be formed.
An ontology is used for this purpose.
Complex systems,
Software systems,
Database systems,
Operating systems,
Bioinformatics systems,
Social Systems,
Service Oriented Systems,
Cloud Systems,
Ubiquitous systems,
Distributed Version Control Systems (GitHub), and
Software Container Systems (DockerHub and Google App Engine).
The Mathematics of Social Network Analysis: Metrics for Academic Social Networks Editor IJCATR
Social network analysis plays an important role in analyzing social relations and patterns of interaction among actors in a
social network. Such networks can be casual, like those on social media sites, or formal, like academic social networks. Each of these
networks is characterised by underlying data which defines various features of the network. Given the size and diversity of
these networks, it may not be possible to dissect an entire network with conventional means. Social network visualization can be used to
graphically represent these networks in a concise and easy-to-understand manner. Social network visualization tools rely heavily on
quantitative features to numerically define various attributes of the network. These features, also referred to as social network metrics,
use everyday mathematics as their foundation. In this paper we provide an overview of various social network analysis metrics that
are commonly used to analyse social networks. An explanation of these metrics and their relevance for academic social networks is also
outlined.
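As an illustration of how simple the mathematics behind such metrics can be, the Python sketch below computes three widely used ones: network density, degree centrality, and the local clustering coefficient. The abstract does not list which metrics the paper covers, so this selection and the function names are assumptions made for the example.

```python
from typing import Dict, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def density(graph: Graph) -> float:
    """Fraction of possible edges that are present: 2m / (n (n - 1))."""
    n = len(graph)
    if n < 2:
        return 0.0
    m = sum(len(nbrs) for nbrs in graph.values()) / 2
    return 2 * m / (n * (n - 1))


def degree_centrality(graph: Graph) -> Dict[str, float]:
    """Degree of each node divided by the maximum possible degree n - 1."""
    n = len(graph)
    if n < 2:
        return {v: 0.0 for v in graph}
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def clustering(graph: Graph, node: str) -> float:
    """Fraction of pairs of the node's neighbours that are themselves linked."""
    nbrs = sorted(graph[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in graph[nbrs[i]])
    return 2 * links / (k * (k - 1))
```

In a co-authorship network, for example, degree centrality counts an author's distinct collaborators, and the clustering coefficient measures how often those collaborators also work with each other.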
EVOLUTIONARY CENTRALITY AND MAXIMAL CLIQUES IN MOBILE SOCIAL NETWORKS ijcsit
This paper introduces an evolutionary approach to enhance the process of finding central nodes in mobile networks. This can provide essential information and enable important applications in mobile and social networks. The evolutionary approach considers the dynamics of the network and takes into consideration the central nodes from previous time slots. We also study the applicability of maximal clique algorithms in mobile social networks and how they can be used to find the central nodes based on the discovered maximal cliques. The experimental results are promising and show a significant enhancement in finding the central nodes.
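To show what finding central nodes from maximal cliques can look like, here is a small Python sketch that enumerates maximal cliques with the basic Bron-Kerbosch algorithm and then counts how many cliques each node belongs to. Using the clique count as a centrality score is an assumption made for illustration; the abstract does not state how the paper scores nodes, and the evolutionary use of previous time slots is not modeled.

```python
from typing import Dict, List, Set

Graph = Dict[str, Set[str]]  # undirected graph as adjacency sets


def maximal_cliques(graph: Graph) -> List[Set[str]]:
    """Enumerate all maximal cliques (Bron-Kerbosch without pivoting)."""
    cliques: List[Set[str]] = []

    def expand(current: Set[str], candidates: Set[str], excluded: Set[str]) -> None:
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend this clique
            return
        for v in sorted(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}  # v is fully explored
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return cliques


def clique_participation(graph: Graph) -> Dict[str, int]:
    """Assumed score: number of maximal cliques that contain each node."""
    counts = {v: 0 for v in graph}
    for clique in maximal_cliques(graph):
        for v in clique:
            counts[v] += 1
    return counts
```

Nodes that sit in several cliques act as bridges between tightly knit groups, which is one plausible reason to treat them as central.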
GRAPH ALGORITHM TO FIND CORE PERIPHERY STRUCTURES USING MUTUAL K-NEAREST NEIG...ijaia
Core periphery structures exist naturally in many complex networks in the real-world like social,
economic, biological and metabolic networks. Most of the existing research efforts focus on the
identification of a meso scale structure called community structure. Core periphery structures are another
equally important meso scale property in a graph that can help to gain deeper insights about the
relationships between different nodes. In this paper, we provide a definition of core periphery structures
suitable for weighted graphs. We further score and categorize these relationships into different types based
upon the density difference between the core and periphery nodes. Next, we propose an algorithm called
CP-MKNN (Core Periphery-Mutual K Nearest Neighbors) to extract core periphery structures from
weighted graphs using a heuristic node affinity measure called Mutual K-nearest neighbors (MKNN).
Using synthetic and real-world social and biological networks, we illustrate the effectiveness of developed
core periphery structures.
D1803022335
IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661,p-ISSN: 2278-8727, Volume 18, Issue 3, Ver. II (May-Jun. 2016), PP 23-35
www.iosrjournals.org
DOI: 10.9790/0661-1803022335 www.iosrjournals.org 23 | Page
Fmfc: Fuzzy Based Improved Mutual Friend Crawling
Varnica
(M.Tech(CSE), Department of Computer Science and Engineering, GNDU Regional Campus, Gurdaspur,
Punjab, India.)
Abstract: Online Social networks are becoming an important part of research in every field. Crawling such
networks and extracting data from them is a very difficult task. Till date, many crawling techniques have been
designed for this purpose. The properties of complex networks have been reviewed in this paper. This paper has
focused on mutual friend crawling and applied it to a dataset. The proposed technique makes use of fuzzy
logic, and a comparison has been made between the existing and proposed techniques. The effectiveness of the
proposed technique is evaluated using different parameters. This paper also includes a survey of the existing
methodologies designed and implemented by different authors. It also provides future directions in this area of
research.
Keywords: ACC, ADD, APL, BFS, Clustering Coefficient, Complex Networks, DFS, MFC, Overheads.
I. Introduction
The time taken to crawl the entire network using existing web crawling techniques introduces a
bias which needs to be minimized. The most common ways of solving this problem are sampling
based on the ids of the nodes (users) in the network or crawling the network until one feels that an adequate amount of
data has been generated. In this paper, a new technique is proposed in which all the users which belong to the
same community are crawled first before moving on to the next community. This reduces the bias introduced by
the time taken for crawling the whole network. This technique enables a researcher to selectively obtain users
belonging to the same community and begin with the evaluation of the network.
1.1 Complex Network
Complex systems are basically networks which are organized into different compartments such that
each compartment has its own role and function to perform. All the compartments consist of nodes. The links
between nodes within a compartment are of high density whereas the links between compartments are of low density [40].
A complex network represents the interactions in the real world in the form of a mathematical model. The major
problem that occurs in complex networks is the identification of the communities which are hidden in the
structure of these networks.
1.2 Communities in Complex Network
A complex network consists of a large number of communities connected to each other. A community
is a collection of a number of nodes connected to each other. The nodes in a particular community perform the same
function and exhibit similar properties. The connectivity among the nodes in a community is higher than the
connectivity between different communities. This means that more links exist among the nodes of a community
than between the nodes of different communities. Some networks consist of overlapping
communities. Overlapping communities are those in which nodes participate in more than one
community.
For each type of network, communities play a very important role as each community depicts a
particular function of the network. In a network like the Internet, there exist communities consisting of web pages
(nodes) which are related to a particular topic. Another example can be the network of a city consisting of
communities formed by people who belong to the same society. Also, nodes in a community are placed
according to the role they perform like nodes which interact with the nodes of other communities are placed at
the boundary and nodes which control the activities of all the other nodes in the community are placed in the
centre.
Till date, no specific definition has been used for community in a network. Different authors use
different definitions of a community for their research work. Several generalizations have been used for
different definitions of communities like considering the properties of the network, considering the edge weights
of the network, considering the structure of the network etc. Detecting the communities of a
complex network is a very important task for studying the properties of the network. For this purpose, several
community detection algorithms have been designed by different authors.
1.3 Properties of Complex Network
Complex networks are very popular and are an emerging topic for research. These networks exhibit
many properties which have been studied in the past. Some of these properties are:
1.3.1 Clustering Coefficient
Clustering coefficient is defined as the ratio of the number of links that exist between the
neighbours of a node to the number of possible links that could exist between the neighbours of that node. The
clustering coefficient of a network is the average clustering coefficient of all the nodes in the network [3]. For
the ith node of a network, the clustering coefficient can be calculated as

$$C_i = \frac{2e}{k(k-1)}$$

where k is the number of neighbours of the ith node and e is the number of edges between these
neighbours.
The value of clustering coefficient lies between 0 and 1. A higher value of clustering coefficient
indicates that there is a higher degree of "cliquishness" between the nodes of the network. A value of 0 for the
clustering coefficient of a graph indicates that it has no "triangles" of connected nodes, and if the value of
clustering coefficient for a graph is 1 then it is a perfect clique.
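The definition above can be illustrated with a short Python sketch. It assumes an undirected graph stored as a dictionary mapping each node to the set of its neighbours; the function names and the two toy graphs are hypothetical, and nodes with fewer than two neighbours are assigned a coefficient of 0 by convention.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Clustering coefficient 2e / (k(k-1)) of one node.

    adj: dict mapping node -> set of neighbours (undirected graph, assumed format).
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # convention chosen here: the ratio is undefined for k < 2
    # e = number of edges that exist between the neighbours of the node
    e = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2 * e / (k * (k - 1))


def average_clustering(adj):
    """Network clustering coefficient: mean of the per-node values."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Toy graphs: a triangle (perfect clique) and a star (no triangles).
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

On these toy graphs the triangle gives the maximum value and the star gives the minimum, matching the two limiting cases described in the text.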
1.3.2 Degree Distribution
A network consists of a large number of nodes and all the nodes have varying degree. Degree distribution
P(k) for a graph gives the probability of a randomly selected node to have a degree k in the network. It is used
to describe the distribution of the links among the nodes in the graph [28].
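As a small illustration, P(k) can be estimated from an adjacency structure by counting how many nodes have each degree. The sketch below assumes the same dictionary-of-sets graph format as an illustration choice; the function name and toy graph are hypothetical.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: P(k)}, the fraction of nodes that have degree k.

    adj: dict mapping node -> set of neighbours (assumed format).
    """
    counts = Counter(len(neighbours) for neighbours in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}


# Toy star graph: one hub of degree 3 and three leaves of degree 1.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
dist = degree_distribution(star)
```

For the star graph, three of the four nodes have degree 1 and one has degree 3, so the probabilities are 0.75 and 0.25.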
1.3.3 Assortativity
The assortativity coefficient r is a measure of the connectivity among the nodes in a network which
are common in some way, like nodes with similar degree. Networks like social networks show assortativity as
highly connected nodes tend to be connected to nodes with high degree, whereas networks like biological
networks show disassortativity as the nodes with high degree tend to be connected with nodes of low degree. The
assortativity coefficient r varies from -1 to 1. If r=1, it means that the nodes tend to be connected to nodes
with similar degree but if r=-1, it means that the nodes tend to be connected to nodes with dissimilar degree [3].
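One common way to compute a degree-based assortativity coefficient, assumed here because the text does not give a formula, is the Pearson correlation between the degrees found at the two ends of every edge, with each undirected edge counted in both directions. The sketch below uses hypothetical names and two toy edge lists.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at both ends of each edge.

    edges: list of (u, v) pairs of an undirected graph (assumed format).
    Returns NaN when all nodes have the same degree (zero variance).
    """
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so one mean suffices
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var if var else float("nan")


# Star: the hub (degree 3) only touches leaves (degree 1).
star_edges = [(0, 1), (0, 2), (0, 3)]
# A triangle plus a separate single edge: every edge joins equal degrees.
matched_edges = [(0, 1), (1, 2), (0, 2), (3, 4)]
```

The star is the extreme disassortative case (r = -1), while the second graph, where every edge joins nodes of equal degree, is the extreme assortative case (r = 1).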
1.3.4 Density
The size of the network can be known from the total number of nodes in the network. Density is used
to define the level of linkage between the nodes in a network. It is generally calculated as the ratio of the
number of existing links between the nodes to the number of possible links. Mathematically, it can be written as

$$\text{Density} = \frac{2E}{N(N-1)}$$

where E is the number of edges in the network and N is the number of nodes in the network. For a complete
network, i.e., a network in which all the nodes are connected with each other, the value of density is 1 [3].
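The density formula is a one-line computation. The following sketch uses a hypothetical function name and treats graphs with fewer than two nodes as having density 0, which is an assumption of this illustration.

```python
def density(num_nodes, num_edges):
    """Density 2E / (N(N-1)) of an undirected graph without self-loops."""
    if num_nodes < 2:
        return 0.0  # assumption: no possible links, so report 0
    return 2 * num_edges / (num_nodes * (num_nodes - 1))


complete_4 = density(4, 6)  # complete graph on 4 nodes has 6 edges
star_4 = density(4, 3)      # star on 4 nodes has 3 edges
```

A complete graph on four nodes has all six possible links and therefore density 1, while a star on four nodes has half of them.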
1.3.5 Community Structure
A community is a set of entities which are closely linked to the other entities in the set. The entities in
one community perform the same function and share some common properties. A community structure reveals
the internal organization of the nodes. Different communities combine to form a complex network.
Networks like biological networks and social networks reveal modular structure [22]. These structures exhibit
more connections within a community than between different communities. The model proposed by Girvan and
Newman was the first model which generated networks with the property of community structure.
1.4 Mutual Friend Crawling
Mutual friend crawling (MFC) was introduced by Blenn N. et al. [13]. This approach crawls all the nodes
of a network in such a way that all the communities are visited one after another. This algorithm assumes
knowledge of the degree of the neighbouring nodes, which is very difficult to obtain in online social networks. The
communities found using mutual friend crawling are smaller than those found by other methods but
their properties are the same [16]. MFC is generally based on the breadth first search algorithm with two differences. The
first difference is that a map is used to store all the discovered nodes. The second difference is in the way the
next node to crawl is chosen. For this, a reference score is calculated [13]. The reference score of a node
is given by the ratio of the number of already discovered nodes that link to it to the number of all the nodes which are linked to it
(its degree). The list of all the discovered nodes is prepared and the one having the highest reference score is processed next [13].
Figure 1: Mutual Friend Crawling
Fig. 1 shows that an entire community is crawled before crawling other connected communities.
Nodes are labelled based on the order of traversal during the crawling process. Different colours of the nodes
denote different communities.
This technique first crawls the nodes on that side of the network where more links have been detected.
If the community structure is based on the definition that there are more links within a community than
between communities, then the communities in a network are visited one after another. This technique is also applicable to
weighted graphs. The only difference is how the reference score is calculated. For weighted graphs, the reference score
is given as the ratio of the weights of the found references to a node to the strength of that node [13].
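The weighted reference score can be sketched as follows. The data layout (a dictionary of neighbour-to-weight dictionaries), the function name and the example values are assumptions made for illustration; strength is taken to be the sum of the weights of all links of the node, and the unweighted score is recovered when every weight equals 1.

```python
def reference_score(node, weights, crawled):
    """Weight of links from already crawled nodes divided by node strength.

    weights: dict mapping node -> {neighbour: link weight} (assumed format).
    crawled: set of nodes that have already been crawled.
    """
    strength = sum(weights[node].values())
    if strength == 0:
        return 0.0  # isolated node: no references can be found
    found = sum(w for nbr, w in weights[node].items() if nbr in crawled)
    return found / strength


# Hypothetical node "a" with three weighted links; only "b" is crawled so far.
weights = {"a": {"b": 2.0, "c": 1.0, "d": 1.0}}
score_weighted = reference_score("a", weights, {"b"})

# Same node with unit weights, i.e. the unweighted reference score.
unit = {"a": {"b": 1, "c": 1, "d": 1}}
score_unweighted = reference_score("a", unit, {"b"})
```

In the weighted case the crawled neighbour carries half of the node's strength, so the score is 0.5; with unit weights the same situation yields one found reference out of three links.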
II. Literature Survey
Blenn N. et al. (2012) [13], introduced a new way of crawling large online social networks. This
technique was named mutual friend crawling. It is compared with standard methods of crawling using breadth
first search and depth first search. This was the first analysis of crawling toward community structure. In this
method, the communities can be analyzed by the researchers even when the crawling process is running. In the
proposed algorithm, the communities are crawled one after another. As compared to breadth first crawling and
depth first crawling, this algorithm considers the degree of nodes for the crawling purpose. This algorithm uses a
parameter called reference score for selecting the nodes to be crawled. In the results it is found that the
performance of mutual friend crawling is better than BFS and DFS. The comparison of the proposed algorithm
is done with the fast and greedy modularity maximizing method, the random walk method and the Louvain method. For
this, three datasets have been used: Zachary's karate club, Girvan and Newman's American college
football games and the Digg network. These methods are compared in terms of modularity, a metric which
defines the community structure representation of the network in graphical form, and the Jaccard similarity index.
Future work is required in terms of existence of overlapping communities in the network. Also more realistic
results are required to get accurate performance of the algorithm.
Gjoka M. et al. (2011) [17], introduced a practical framework for obtaining users of any online social
network using its social graph. Various crawling techniques have been studied like breadth first sampling and
random walk sampling. Both of these techniques result in a bias towards high degree nodes. They have also
contributed the use of formal convergence diagnostics which help to determine when to stop sampling
even in the absence of ground truth. A comparison has been done between re-weighted random walk and
Metropolis-Hastings random walk. This comparison is made in terms of bias and efficiency. The results show
that re-weighted random walk gives better performance than Metropolis-Hastings random walk in terms of
efficiency. The implementation of high performance distributed crawler faced many challenges when applied to
an online social network because of which the data collection time has been kept very small. An offline
comparison is made between the ground truth and different crawling methods. Facebook has been used as the
case study and the implementation of these methods on it shows the structural properties of the users. A uniform
sample of Facebook users is obtained which can be used for further research on crawling. The privacy awareness
of Facebook users has also been studied and the results show that about 84% of the users leave their settings
unchanged.
Orman G. et al. (2011) [15] review the properties which describe the community structure of
complex networks. The LFR benchmark has been studied and steps have been taken to improve the realism of
the networks generated by this benchmark. After the generation of networks, community detection
algorithms have been applied and the community structure generated by these algorithms is studied to
compare the results. The properties of the generated networks have been analyzed, out of which community size
and hub dominance are realistic properties, embeddedness and average distance are partially realistic properties
and scaled density is a non-realistic property. The study is done on undirected and unweighted networks with
no overlapping communities. The observations show that the generated communities do not possess all the
properties of a real network. The small communities are very dense but they should be star-shaped, and the
large communities are not dense, which is the opposite of what is seen in real world networks. Different
algorithms like the Louvain method, fast and greedy, Markov cluster, Infomap etc. have been applied to the
generated networks and their performance has been analyzed in terms of normalized mutual information. Some
of these algorithms have generated communities of very small size, some even having only a single node. Infomap
has shown the best results while fast and greedy shows the worst results.
Bas Van Kester (2011) [16], focuses on the problems of crawling online social network i.e., the
crawling process is very time consuming and also analyzing the networks through different community
detection methods is very expensive. A new crawling method called mutual friend crawling has been proposed
which allows the researchers to analyze the networks even before the crawling process is complete. The
proposed method is compared with the existing crawling methods like breadth first crawling and depth first
crawling. Two ways of obtaining social networks have been discussed: networks which are
constructed manually by the researchers and online social networks. A brief overview of commonly used
community detection techniques is given, like edge betweenness clustering used for small online social
networks, label propagation used for large online social networks, and spinglass and fast and greedy community
detection used for networks with different sizes. Different measures like modularity, Jaccard index etc. have been
used for comparison of community detection algorithms. The datasets used are the football network by Girvan and
Newman, the Enron email network and computer generated cluster networks. The results of the algorithms
discussed are compared with the ground truth and it is seen that the result of the spinglass community detection algorithm
is more similar to the ground truth than those of the other algorithms. But this algorithm is more expensive than the
others. The results show that the proposed mutual friend crawling algorithm performs better than the existing
breadth first and depth first crawling techniques. Future work is required for overlapping communities and
the performance on very large online social networks needs to be analyzed.
Kumar S. et al. (2010) [23] have combined the concepts of web crawler and web mining. The internet
is growing at a very fast rate and so the process of web mining has become more important. The different
types of web mining have been discussed in detail. Web usage mining is a very recent topic of research which
estimates the behavior of users when they are interacting with the web. A brief explanation has been given about
web content mining. Research has been done to find the type of data on which different mining techniques are
applicable. The use of different mining techniques like web structure mining, web content mining etc. has also
been detailed. The use of web crawlers in search engines is also given along with architecture of web crawler
which can be used by different users to extract web pages related to their topics of interest. The different types
of sources from which data is collected for the purpose of web mining are also explained.
Saramaki J. et al. (2007) [30], present the different generalizations on clustering coefficient which is a
very important property of complex networks. The different equations used for calculating clustering
coefficient, by different authors have been studied in detail. Also, their advantages and disadvantages have been
discussed which provide an idea of which method is appropriate according to requirement. The main focus has
been laid on the weighted clustering coefficient. According to the study, it is required that the weighted
clustering coefficient should also consider the weight present in the neighborhood of a particular vertex. The
different definitions have been studied on two networks i.e., international trade network and scientific
collaboration network. The results conclude that the topological nature of the network can be analyzed from
weighted clustering coefficient. The number of edges in the neighborhood can be measured but to calculate the
amount of weight in the neighborhood is still a question left for future research.
Latapy M. and Magnien C. (2008) [28] have introduced the first practical way to check the properties
of complex networks as their graphs grow. The changes in the properties have been observed till
stability is reached. Their approach has been used on real world networks which contradicts the previous
assumption used by different authors for measuring the properties of complex networks i.e., to collect data from
complex networks in proportion to the measuring ability and time limit of the methodology used. The amount of
data which was extracted was considered as representing whole of the complex network. Several datasets have
been used in this study like the inet dataset, the web dataset, the ip dataset and the p2p dataset. The properties
studied are size of the network, average degree, average density, clustering, degree distribution and transitivity.
In the results for the four datasets used, it has been observed that some properties are stable and some are
unstable as a result of which there is no inter-dependency among the properties. The graphical representation for
all these properties has been shown for the four datasets used in the paper. They depict clear variation among the
values of the properties like for inet dataset the average degree graph shows that the values increase with the
increase in the size of the network whereas for the web dataset, stability is observed for average degree.
Therefore it has been concluded that for evaluation on small part of the network, if the property is stable then
same behavior can be assumed for the whole of the network.
Pavalam S. M. et al. (2011) [18] have reviewed the different web crawling algorithms being used for
crawling data from different networks. The type of web crawling algorithm should be chosen according to the
requirement of the search procedure. The process of web crawling has been described which includes the
selection of target web pages, selection of a seed node from where the crawling process is to start, priorities of
different web pages, revisited web pages etc. Different algorithms like breadth first search, depth first search,
random walk, best first search, genetic algorithm etc. have been detailed. The use of these algorithms and the
strategy followed by them is reviewed so that readers can easily identify the algorithm which best suits their
requirements. The advantages and disadvantages of the algorithms are also mentioned in the survey. The use of
these algorithms by different authors is also reviewed. It has been concluded that the genetic algorithm gives
best results among all the other algorithms discussed.
Rosvall M. et al. (2007) [31], have presented an article related to the community structure of complex
networks. A basic framework has been designed for identifying the communities of a complex network. The
graphical working of this framework has also been described. A signaler has been used in the framework which
has complete knowledge about the network and divides the network into different modules so that more and
more information can be extracted from the network. Many social networks have been used to apply their
approach and evaluate the results. The results are then compared with the previous approaches that have been
adopted by many authors. The designed approach is very useful for the networks in which the communities are
of same size and degree. To check the performance different benchmark tests have been applied.
Reichardt J. and Bornholdt S. (2006) [33], have used weighted and directed networks for their work.
They have designed a general framework which can be used for solving community detection problems. This
framework has been particularly developed for finding community of a desired node of the network rather than
finding all the communities of the network. Work has also been done on studying the overlap in the community
structure of the network. The framework has been very helpful for large networks as finding communities of
such a network is a very time consuming task. Two measures have been used by the framework which can be
used for comparison with other algorithms implemented by different authors for community detection in
complex networks. The performance of the framework has been evaluated by using benchmarks. The
relationship between graph partitioning and community detection has also been discussed. Different definitions of
community have been found by studying the literature by different authors. The properties of the networks
which can be evaluated from these definitions have also been discussed. The co-authorship network has also
been studied.
Danon L. et al. (2005) [36], have focused on different methodologies which are used for community
detection in complex networks. Identifying the communities of a network and then characterizing them is a very
important task in community detection. Much progress has been achieved in this field. But still, many issues
exist such as the implementation of such techniques remains very costly. The method of community detection
should be chosen based on the requirements of the authors. Some authors consider accuracy as the basic
requirement and some consider the concept of overlapping communities as the basic requirement etc. Different
methods have been compared considering the cost that is incurred on the application of such methods on
complex networks. Emphasis has been laid on the different factors which should be considered for choosing an
algorithm. Some criteria like accuracy, sensitivity, computational effort, speed etc. are very important while
selecting a method for community detection. Speed has mostly been considered as a factor when applying a
method on large networks. It has been concluded that more efforts are required to find a method which is fast
enough in detecting communities in complex network.
III. Proposed Technique
The proposed technique uses fuzzy based clustering to obtain the best communities and therefore also
detects communities at runtime. The proposed Fuzzy Based Improved Mutual Friend Crawling
(FMFC) has been designed and tested using MATLAB to evaluate its effectiveness over the existing
technique.
Algorithm:
1. Load the football dataset.
2. Evaluate the size of the data, i.e., M for the number of rows and N for the number of columns.
3. Define the complex network nodes, i.e., N = M + N.
4. Define the edges for the nodes: K = N/3. Therefore, the network is 33% connected.
5. Evaluate the triangular membership functions from the football dataset and store them in y1, y2 and y3 respectively.
6. Evaluate the fuzzy relationship value:
   $R = \frac{N + 1 \sum y_2 + \sum y_3}{1 + \sum y_1}$
7. Construct a queue Q.
8. Insert the starting node in Q.
9. Store the initial node, with 0 as its number of found references, in R.
10. While Q is not empty do
11.    For all elements in R, calculate the reference score
12.       reference_score = value in R / degree of the node
13.       max_score = max(max_score, reference_score)
14.    end for
15.    select the element having the maximum score
16.    next_node ← dequeue the element having max_score from Q
17.    delete next_node from R
18.    if next_node has not been visited yet then
19.       add all the neighbors of next_node to the queue
20.       increment the number of found references to each neighbor by 1 and store it in R
21.       remember that the node (next_node) was visited
22.    end if
23. end while
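The crawling loop in steps 7 to 23 can be illustrated with a minimal Python sketch. It is not the MATLAB implementation used in this paper and it leaves out the fuzzy steps 5 and 6. The function name `mutual_friend_crawl`, the dictionary-of-sets graph representation, the merging of Q and R into one dictionary, and the tie-breaking rule (smallest node label first) are all assumptions made for illustration.

```python
def mutual_friend_crawl(graph, start):
    """Return the nodes in the order they are visited.

    graph: dict mapping each node to the set of its neighbours (undirected).
    start: the seed node.
    """
    # refs plays the role of both Q and R in the pseudocode (an assumption):
    # refs[node] = number of already visited nodes that link to `node`.
    refs = {start: 0}
    visited_order = []
    seen = set()

    def score(node):
        # reference score = found references / degree of the node
        degree = len(graph[node])
        return refs[node] / degree if degree else 0.0

    while refs:
        # Pick the candidate with the highest score; sorting first means
        # ties go to the smallest node label (illustrative choice only).
        next_node = max(sorted(refs), key=score)
        del refs[next_node]
        seen.add(next_node)
        visited_order.append(next_node)

        # Only unvisited neighbours are queued, so the "not visited yet"
        # check of step 18 is enforced here instead of at dequeue time.
        for neighbour in graph[next_node]:
            if neighbour not in seen:
                refs[neighbour] = refs.get(neighbour, 0) + 1

    return visited_order


# Hypothetical toy network: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
two_triangles = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
order = mutual_friend_crawl(two_triangles, 2)
```

Starting from the bridging node 2, the sketch finishes the triangle containing nodes 0 and 1 before it crosses to node 3, which is the community-first behaviour the technique aims for.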
IV. Experimental Evaluation
4.1 Results of Existing Technique
In this section, I present the results obtained by implementing the technique called mutual friend
crawling in MATLAB. The dataset used is “American College Football Games” of the year 2000,
gathered from the internet. The nodes in the network represent the teams and the edges represent the links
between them: a link exists between two teams if they have played a match against each other.
To check the performance, three parameters have been considered, i.e., execution time, overheads and
number of clusters. Also, the properties of complex networks, i.e., average path length, average clustering
coefficient and average degree distribution, have been studied.
Table 1: Evaluation of the properties of complex network from a well known dataset
| Average Path Length (APL) | Average Clustering Coefficient (ACC) | Average Degree Distribution (ADD) | Execution Time |
|---|---|---|---|
| 1 | 1 | 14 | 1.5358 |
| 1.2857 | 0.68889 | 10 | 2.0030 |
| 1.7143 | 0.6 | 6 | 2.3252 |
| 1.7143 | 0.6 | 6 | 2.6526 |
| 2.2857 | 0.5 | 4 | 2.9919 |
Table 1 shows the values of the properties along with the execution time. The properties
derived are average path length, average clustering coefficient and average degree distribution. It can be seen
clearly that the average path length increases as the density of the network decreases,
while the average clustering coefficient and the average degree distribution decrease as the network
becomes less dense.
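As an illustration of how these three properties can be computed, the following Python sketch evaluates them on a complete graph with 15 nodes, which is the only kind of network that gives APL = 1, ACC = 1 and ADD = 14 as in the first row of Table 1. The function names and the dictionary-of-sets representation are assumptions for illustration, the graph is assumed to be connected with at least two nodes, and the local clustering coefficient of a node with fewer than two neighbours is taken as 0.

```python
from collections import deque
from itertools import combinations


def average_degree(graph):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in graph.values()) / len(graph)


def average_clustering(graph):
    """Mean of the local clustering coefficients of all nodes."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # coefficient taken as 0 for degree below 2
        # count the edges that exist between pairs of neighbours
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Complete graph on 15 nodes: every node is linked to the other 14.
complete_15 = {i: {j for j in range(15) if j != i} for i in range(15)}
apl = average_path_length(complete_15)
acc = average_clustering(complete_15)
add = average_degree(complete_15)
```

Removing edges from such a graph lowers the average degree and lengthens the shortest paths, which matches the trend reported in Table 1.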
In the results, the complex networks with different values of APL, ACC and ADD are shown in Figures 2,
3, 4 and 5. As the values of the properties change, the density of the network also changes. The nodes,
shown in pink, represent the different teams in the dataset, and the links between them are shown
as blue edges.
Figure 2: Complex Network with APL=1, ACC=1 and ADD=14
Fig. 2 is the densest network. The nodes in pink are the different teams of the dataset, which are
linked to each other by edges shown in blue. The average path length for this graph is 1, the average clustering
coefficient is 1 and the average degree distribution is 14.
Figure 3: Complex Network with APL=1.2857, ACC=0.68889 and ADD=10
Fig. 3 is less dense than Fig. 2, with different values of the properties. Also, the execution
time (2.0030) is observed to be higher than the execution time for the previous graph (1.5358), as shown in
Table 1.
Figure 4: Complex Network with APL=1.7143, ACC=0.6 and ADD=6
Fig. 4 is less dense than the previous graphs. Two such graphs are observed: the values of the properties
are the same for both, but the execution time varies, as shown in Table 1.
Figure 5: Complex Network with APL=2.2857, ACC=0.5 and ADD=4
Fig. 5 is the least dense graph. The value of the degree distribution is the lowest and the
value of the average path length is the highest for this graph. The execution time is also the highest for
this graph.
Figure 6: Degree of Distribution=14
Fig. 6 shows the resultant graph of degree distribution for the network in Fig. 2. The degree of
distribution is observed to be the highest for this network.
Figure 7: Degree of Distribution=10
Fig. 7 shows the graph of degree distribution for the network in Fig. 3.
Figure 8: Degree of Distribution=6
Fig. 8 shows the graph of degree distribution for the network in Fig. 4. Here again two similar graphs
are observed for the same value of degree of distribution.
Figure 9: Degree of Distribution=4
Fig. 9 shows the graph of degree distribution for the network in Fig. 5. The degree of distribution is
observed to be the least for this network and the execution time is the highest.
Figure 10: Average Degree Distribution Using Existing Technique
Fig. 10 shows the graph of the average degree distribution for the whole network. P(k) is the probability
that a node has degree k, i.e., the fraction of nodes having degree k.
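A degree distribution of this kind can be computed as in the short Python sketch below. The function name `degree_distribution` and the small star-shaped network are hypothetical and are not taken from the football dataset.

```python
from collections import Counter


def degree_distribution(graph):
    """Return {k: P(k)}, the fraction of nodes whose degree is k."""
    counts = Counter(len(nbrs) for nbrs in graph.values())
    n = len(graph)
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical example: a star with one hub (node 0) and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
p = degree_distribution(star)
```

For the star, four of the five nodes have degree 1 and one node has degree 4, so P(1) is 0.8 and P(4) is 0.2, and the values of P(k) sum to 1.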
Figure 11: Number of Clusters and Overheads Using Existing Technique
4.2 Results of Proposed Technique
The following figures are the results of the proposed technique i.e., fuzzy based improved mutual
friend crawling.
Table 2: Evaluation of the Properties of Complex Network by Applying Fuzzy Logic in Mutual Friend Crawling
| Average Path Length (APL) | Average Clustering Coefficient (ACC) | Average Degree Distribution (ADD) | Execution Time |
|---|---|---|---|
| 0.85 | 0.85 | 2.15 | 0.6090 |
Table 2 shows the values of average path length, average clustering coefficient, average degree
distribution and execution time found by applying fuzzy logic to the existing technique. The proposed technique, i.e.,
FMFC, makes use of a fuzzy membership function to generate the network of the football dataset. The
values derived are shown in this table.
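For readers unfamiliar with triangular membership functions, the following Python sketch shows the standard definition. The parameters `a`, `b` and `c` (left foot, peak and right foot) are assumptions for illustration; the paper does not state the parameter values used to obtain y1, y2 and y3.

```python
def triangular_membership(x, a, b, c):
    """Membership degree of x in a triangular fuzzy set.

    a and c are the feet of the triangle (membership 0) and b is the peak
    (membership 1); the sketch assumes a < b < c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


# Hypothetical fuzzy set with feet at 0 and 10 and its peak at 5.
values = [triangular_membership(x, 0, 5, 10) for x in (0, 2.5, 5, 7.5, 12)]
```

The membership rises linearly from 0 at the left foot to 1 at the peak and falls back to 0 at the right foot, so each input is mapped to a value in the range 0 to 1.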
Figure 12: Complex Network of Football Dataset
This is the graph of the complex network for the football dataset, which is similar to the previous graph
generated using the existing technique. After applying fuzzy logic in mutual friend crawling, no change is found
in the representation of the network. It consists of 15 nodes, which represent the communities of the network.
Figure 13: Degree of Distribution of Football Dataset
Fig. 13 shows the distribution of the network having degree 14. The x-axis represents the time of
distribution of the edges and the y-axis shows the edges.
Figure 14: Average Degree Distribution Using Proposed Technique
Fig. 14 shows the graph of the probability of degree distribution of the network. This graph shows a better
distribution when fuzzy logic is used than the previous graph for the existing technique. The existing
technique gives the distribution at degree 4, which is the lowest degree for the network, whereas the proposed
technique gives the distribution at degree 14, which is the maximum degree of the network.
Figure 15: Membership Value of Edges
Figure 16: Number of Clusters and Overheads Using Proposed Technique
Table 3: Comparison of Existing and Proposed MFC
| Sr. No. | Parameter | Existing Technique | Proposed Technique |
|---|---|---|---|
| 1. | Number of Clusters | 3 | 18 |
| 2. | Overheads | 2.6606 | 0.2218 |
| 3. | Execution Time | 2.9919 | 0.6090 |
Table 3 shows the comparison between the existing mutual friend crawling technique and the fuzzy
based improved mutual friend crawling technique. The comparison is based on three parameters, i.e., number of
clusters, overheads and execution time. The number of clusters found using the existing MFC is 3, which is a very
small number. In comparison, the number of clusters found using FMFC is 18, which is a more favourable value.
The value of overheads for the existing MFC is very high, i.e., 2.6606, whereas when fuzzy logic is applied in MFC,
the overheads are reduced to 0.2218, a reduction of about 92%. This means we get better performance with fuzzy based MFC. The
execution time of the existing technique is found to be very high, i.e., 2.9919, which means that more time is
consumed in crawling the network of the dataset used. Using fuzzy logic, the execution time is reduced to
0.6090, which is about 80% lower. Therefore, less time is used in crawling the network. Hence, using fuzzy logic in MFC proves to be
beneficial in terms of number of clusters, overheads and execution time.
V. Conclusion
Different web crawling algorithms are used for crawling data from different networks. The type of
web crawling algorithm should be chosen according to the requirements of the search procedure. In this paper, the
process of web crawling has been described, which includes the selection of target web pages, the selection of a seed
node from which the crawling process is to start, the priorities of different web pages, revisited web pages etc.
Different algorithms like breadth first search, depth first search, random walk, best first search, genetic
algorithm etc. are used in the literature. The use of these algorithms and the strategy followed by them are
reviewed so that readers can easily identify the algorithm which best suits their requirements. Also, the
different properties of complex networks like assortativity, community structure, degree distribution, clustering
coefficient, density etc. have been studied and explained in detail.
VI. Future Scope
This work has not considered the use of a genetic algorithm to enhance the results further, but many
researchers have shown that genetic based web crawling techniques outperform the available techniques.
Therefore, in the near future, a genetic algorithm based mutual friend crawling technique will be proposed to improve the
community detection rate.
References
[1] Yadav A. and Singh P., “Web Crawl Detection and Analysis of Semantic Data”, International Journal of Computer Trends and
Technology, Volume 21, Number 1, ISSN: 2231-2803 (2015).
[2] Ahuja M., Singh J. and Varnica, “Web Crawler: Extracting the Web Data”, International Journal of Computer Trends and
Technology, Volume 13, Number 3, ISSN: 2231-2803 (2014).
[3] Ahuja M. S. and Singh J., “Future Prospects in Community Detection”, International Journal of Computer Science
Engineering and Information Technology Research, Vol. 4, Issue 5, ISSN: 2249-6831 (2014).
[4] Mcphail M., “The Statistical Properties of Complex Networks”, Australian Mathematical Sciences Institute (2014).
[5] Janbandhu R., Dahiwale P. and Raghuwanshi M., “Analysis of Web Crawling Algorithms”, International Journal on Recent and
Innovation Trends in Computing and Communication, Volume 2, Issue 3, ISSN: 488-492 (2014).
[6] Kumar R., Jain A. and Agrawal C., “Survey of Web Crawling Algorithms”, Advances in Vision Computing: An International
Journal, Volume 1 (2014).
[7] Yan H. et al., “Community detection using global and local structural information”, Pramana – J. Phys., Vol. 80, No. 1 (2013).
[8] Iswary R. and Nath K., “Web Crawler”, International Journal of Advanced Research in Computer and Communication Engineering,
Vol. 2, Issue 10, ISSN: 2278-1021 (2013).
[9] Yang J. and Leskovec J., “Defining and Evaluating Network Communities Based on Ground-Truth”, Springer, DOI
10.1007/s10115-013-0693-z (2013).
[10] Orman G., Labatut V. and Cherifi H., “Towards Realistic Artificial Benchmark for Community Detection Algorithms”,
International Journal of Web Based Communities, pp. 349-370 (2013).
[11] Nath R. and Chopra K., “Web Crawlers: Taxonomy, Issues and Challenges”, International Journal of Advanced Research in
Computer Science and Software Engineering, Volume 3, Issue 4, pp. 944-948 (2013).
[12] Kausar M., V. S. Dhaka and Singh S., “Web Crawler: A Review”, International Journal of Computer Applications (0975-8887),
Volume 63, Number 2 (2013).
[13] Blenn N., Doerr C., Kester B. and Mieghem P., “Crawling and Detecting Community Structure in Online Social Networks using
Local Information”, Springer, LNCS 7289, pp. 56-67 (2012).
[14] Khurana D. and Kumar S., “Web Crawler: A Review”, International Journal of Computer Science & Management Studies, Vol. 12,
Issue 01, ISSN: 2231 –5268 (2012).
[15] Orman G., Labatut V. and Cherifi H., “Qualitative Comparison of Community Detection Algorithms”, International DICTAP 2011,
DOI 10.1007/978-3-642-22027-2_23 (2011).
[16] Bas Van Kester, “Efficient crawling of community structures in online social networks”, PVM 2011-071, Tu Delft (2011).
[17] Gjoka M., Kurant M., Butts C. and Markopoulou A., “Practical Recommendations on Crawling Online Social Networks”, IEEE
Journal on Selected Areas in Communications, Vol. 29, No. 9 (2011).
[18] Pavalam S. M., Raja S., Akorli F. and Jawahar M., “A Survey of Web Crawler Algorithms”, International Journal of Computer
Science Issues, Volume 8, Issue 6, ISSN: 1694-0814 (2011).
[19] Meo P., Nocera A., Terracina G. and Ursino D., “Recommendation of Similar Users, Resources and Social Networks in a Social
Internetworking Scenario”, Information Sciences, Volume 181, Number 7, pp. 1285-1305 (2011).
[20] Olston C. and Najork M., “Web Crawling”, NOW The Essence of Knowledge, Vol. 4, No. 3, pp. 175–246 (2010).
[21] Coscia M., Giannotti F. and Pedreschi D., “A Classification for Community Discovery Methods in Complex Networks”, Wiley
Online Library, Volume 4, DOI:10.1002 (2010).
[22] Fortunato S., “Community Detection in Graphs”, Physics Reports, 486(3-5):75 (2010).
[23] Kumar Pani S., Mohapatra D. and Keshari Ratha B., “Integration of Web Mining and Web Crawler: Relevance and State of Art”,
International Journal on Computer Science and Engineering, Volume 2, No. 3, 772-776 (2010).
[24] Lancichinetti A. and Fortunato S., “Benchmarks for testing community detection algorithms on directed and weighted graphs with
overlapping communities”, Physical Review E 80, 016118 (2009).
[25] Lancichinetti A. and Fortunato S., “Community detection algorithms: A comparative analysis”, Physical Review E 80, 056117
(2009).
[26] Orman G. and Labatut V., “A Comparison of Community Detection Algorithms on Artificial Networks”, Discovery Science, Porto :
Portugal, DOI : 10.1007/978-3-642-04747-3_20 (2009).
[27] Lancichinetti A., Fortunato S. and Radicchi F., “Benchmark graphs for testing community detection algorithms”, Physical Review E
78, 046110 9 (2008).
[28] Latapy M. and Magnien C., “Complex Network Measurements: Estimating the Relevance of Observed Properties”, IEEE
INFOCOM, Phoenix, USA, pp. 2333-2341 (2008).
[29] Blondel V., Guillaume J. and Lambiotte R., “Fast Unfolding of Communities in Large Networks”, Journal of Statistical Mechanics:
Theory and Experiment, P10008 (2008).
[30] Saramaki J., Kivela M., Onnela J., Kaski K. and Kertesz J., “Generalizations of the Clustering Coefficient to Weighted Complex
Networks”, Physical Review E 75, 027107 (2007).
[31] Rosvall M. and Carl T. Bergstrom, “An Information-Theoretic Framework for Resolving Community Structure in Complex
Networks”, The National Academy of Sciences of the USA, Volume 104, No. 18, 7327-7331 (2007).
[32] Boccaletti S., Latora V., Moreno Y., Chavez M. and D.-U. Hwang, “Complex Networks: Structure and Dynamics”, Elsevier,
Physics Reports 424, DOI: 10.1016, 175-308 (2006).
[33] Reichardt J. and Bornholdt S., “Statistical Mechanics of Community Detection”, Physical Review E 74, 016110 (2006).
[34] Trusina A., “Complex Networks: Structure, Function, Evolution”, Umea University, ISBN: 91-7305-924-2 (2005).
[35] Cotta C. and Merelo J., “The Complex Network of Evolutionary Computation Authors: an Initial Study”, arXiv: physics/0507196v2
(2005).
[36] Danon L., Diaz-Guilera A., Duch J. and Arenas A., “Comparing Community Structure Identification”, Journal of Statistical
Mechanics: Theory and Experiment, DOI: 10.1088/1742-5468/2005/09/P09008 (2005).
[37] Pant G., Srinivasan P. and Menczer F., “Crawling the Web”, Springer Berlin Heidelberg, ISBN: 978-3-662-10874-1 (2004).
[38] M.E.J. Newman, “The Structure and Function of Complex Networks”, SIAM Review 45, 167-256 (2003).
[39] Girvan M. and M. E. J. Newman, “Community Structure in Social and Biological Networks”, National Academy of Sciences of the
United States of America, Volume 99, Number 12, pp. 7821-7826 (2002).
[40] Wang Z., “Community Detection Approaches In Complex Networks: A Review”, Department of Applied Mathematics, Fudan
University.
[41] Varnica and Mini Singh Ahuja, “Studying the Properties of Complex Network Crawled Using MFC”, International Research
Journal of Engineering and Technology, Volume 2, Issue 4, e-ISSN: 2395-0056 (2015).