POLITICAL OPINION ANALYSIS IN SOCIAL NETWORKS: CASE OF TWITTER AND FACEBOOK dannyijwest
The 21st century has been characterized by increased attention to social networks. Nowadays, going 24
hours without coming into contact with them in some way has become difficult. Social platforms such as
Facebook and Twitter are now part of everyday life. These networks have therefore become important
sources for learning about frequently discussed topics and public opinion on current issues. More and more
people write messages about current events, give their opinions on all kinds of topics, and discuss social issues.
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING IJwest
Social networks have become one of the most popular platforms for users to communicate and share their interests without being in the same geographical location. The rapid growth of social media sites such as Facebook, LinkedIn, and Twitter has produced a huge amount of user-generated content. Improving information quality and integrity has thus become a great challenge for all social media sites: users need improved search and linking techniques to reach the desired content or the most relevant connections. Introducing semantics to social networks broadens their representation. In this paper, a new model of social networks based on semantic tag ranking is introduced. The model is based on the concept of multi-agent systems. In the proposed model, the representation of social links is extended by the semantic relationships found in the vocabularies known as tags in most social networks. The proposed model for the social media engine is based on enhanced Latent Dirichlet Allocation (E-LDA) as a semantic indexing algorithm, combined with Tag Rank as a social network ranking algorithm. The E-LDA phase improves on LDA by running the algorithm with optimal parameters; a filter is then introduced to enhance the final indexing output. In the ranking phase, applying Tag Rank to the output of the indexing phase improves the ranking. Simulation results of the proposed model show improvements in indexing and ranking output.
Temporal Exploration in 2D Visualization of Emotions on Twitter Stream TELKOMNIKA JOURNAL
As people freely express their opinions about a product on Twitter streams at any time,
visualizing the temporal pattern of customers' emotional behavior can play a crucial role in decision-making.
We analyze how emotions fluctuate over time and demonstrate how these patterns can be explored through
useful visualizations with an appropriate framework. We manually customized the current framework in
order to improve on the state of the art in crawling and visualizing Twitter data. The data, consisting of posts
and status updates about the iPhone on the Twitter website, was collected from the U.S.A., Japan, Indonesia,
and Taiwan using geographical bounding boxes and visualized as a two-dimensional heat map, an interactive
stream graph, and a context-focus view via brushing. The results show that our proposed system can
reveal the uniqueness of the temporal patterns of customers' emotional behavior.
Current trends of opinion mining and sentiment analysis in social networks eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
The Mathematics of Social Network Analysis: Metrics for Academic Social Networks Editor IJCATR
Social network analysis plays an important role in analyzing social relations and patterns of interaction among actors in a
social network. Such networks can be casual, like those on social media sites, or formal, like academic social networks. Each of these
networks is characterised by underlying data which defines various features of the network. Given the size and diversity of
these networks, it may not be possible to dissect an entire network by conventional means. Social network visualization can be used to
represent these networks graphically in a concise and easy-to-understand manner. Social network visualization tools rely heavily on
quantitative features to numerically define various attributes of the network. These features, also referred to as social network metrics,
use everyday mathematics as their foundation. In this paper we provide an overview of various social network analysis metrics that
are commonly used to analyse social networks. An explanation of these metrics and their relevance for academic social networks is also
outlined.
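As a rough illustration of the kind of metric the abstract above refers to, the sketch below computes two standard ones, normalized degree centrality and graph density, for a small undirected network. The abstract does not say which metrics the paper covers, so this choice, the function names, and the toy edge list are all assumptions made for the example.

```python
def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph given as (u, v) pairs.

    Assumes no self-loops; duplicate edges are counted once.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Divide each degree by the largest possible degree, n - 1.
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


def density(edges):
    """Fraction of possible undirected edges that are actually present."""
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    unique_edges = {frozenset(edge) for edge in edges}
    return 2 * len(unique_edges) / (n * (n - 1))


# Hypothetical toy network: A is connected to everyone, D only to A.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
centrality = degree_centrality(toy_edges)
print(centrality["A"], centrality["D"], density(toy_edges))
```

In this toy network, A reaches every other node and so gets the maximum centrality of 1.0, while the density says that four of the six possible ties exist.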
Big Data Social Network Analysis (BDSNA) is the computational and graphical
study of powerful techniques that can be used to identify clusters, patterns, and hidden
structures, and to generate business intelligence, from social relationships within social networks
in terms of network theory. Social Network Analysis (SNA) has a diverse set of
applications and research areas, such as Health Care, Travel and Tourism, Defence and
Security, and the Internet of Things (IoT). With the boom of the internet, Web 2.0
and handheld devices, there is explosive growth in the size, complexity and variety of
unstructured data; analysis and information extraction are therefore of great value, and
adapting the Big Data concept to SNA is vital.
This literature survey aims to investigate the usefulness of SNA in the “Big Data
(BD)” arena. The survey reviews major research studies that have proposed
business strategies and BD approaches to generate predictive models by addressing contemporary
challenges that have arisen from SNA.
Fuzzy and ANN Based Mining Approach Testing for Social Network Analysis IJERA Editor
Fast and appropriate Social Network Analysis (SNA) tools and techniques are required to collect and classify
opinion scores on social network sites, as grouping on a wrong opinion may create problems for a society or
country. SNA is a popular tool for researchers, as the number of users and groups
on social sites is increasing day by day and a large group may influence others. In this paper, we
recommend a hybrid model of opinion recommendation systems, for a single user and for a collective community
respectively, based on social liking and influence network theory. By collecting data on users' social networks
and preferences (likes), we designed an improved hybrid prototype to imitate social influence through liking and sharing
information among groups. The significance of this paper is to analyze the suitability of ANN and fuzzy set
methods, used in a hybrid manner, for classifying social web sites. First, we intend to use Artificial Neural
Network (ANN) techniques in social media data classification, using contemporary methods that differ
from the conventional methods of statistics and data analysis. Next, we propose the fuzzy approach
as a way to overcome the uncertainty that is always present in social media analysis. We give a brief overview
of the main ideas and recent results of social network analysis, and we point to relationships between the two
social network analysis and classification approaches. This research suggests a hybrid classification model built
on fuzzy sets and artificial neural networks (HFANN). Information Gain and three popular social sites are used to
collect information depicting features that are then used to train and test the proposed methods. This novel
approach combines the advantages of ANN and fuzzy sets in classification accuracy with the use of social data
and the knowledge base available in hate lexicons.
A LINK-BASED APPROACH TO ENTITY RESOLUTION IN SOCIAL NETWORKS csandit
Social networks were initially places for people to contact each other and to find friends or new
acquaintances. As such, they have always proved interesting for machine-aided analysis. Recent
developments, however, have turned social networks into one of the main venues for information
exchange, opinion expression and debate. As a result, there is growing interest in both analyzing
and integrating social network services. In this environment, efficient information retrieval is
hindered by the vast amount and varying quality of user-generated content. Guiding users to
relevant information is a valuable service and also a difficult task, in which a crucial part of the
process is accurately resolving duplicate entities to real-world ones. In this paper we propose a
novel approach that uses the principles of link mining to extend the methodology
of entity resolution to multitype problems. The proposed method is presented using an
illustrative real-world example based on a social network and validated by a comprehensive
evaluation of the results.
Mining and Analyzing Academic Social Networks Editor IJCATR
Academics establish relationships through various interactions, such as jointly authoring a research paper or report, jointly
supervising a thesis, or working jointly on a project. Some of these relationships are ubiquitous, whereas others are hard to keep track
of. Of all types of academic and research collaboration, co-authorship is the best documented. In this paper we analyze the
co-authorship-based academic social networks of the computer science engineering departments of Indian Institutes of Technology (IITs), as
evidenced by their research publications produced between 2011 and 2015. We use social network analysis metrics to study the
collaboration networks in four leading IITs. From the experimental results it can be concluded that IIT Delhi and IIT Kharagpur have
close-knit collaboration networks, whereas the collaboration networks of IIT Kanpur and IIT Madras are fragmented. However, the
collaboration networks of all four IITs exhibit network properties similar to those expected of any other collaboration network.
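One simple way to make the contrast between a close-knit and a fragmented collaboration network concrete is to count connected components in the co-authorship graph. The sketch below is only an illustration under that assumption: the abstract does not name the metrics the authors used, and the author names and paper lists are invented.

```python
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected co-authorship graph from a list of author lists."""
    graph = {}
    for authors in papers:
        unique_authors = sorted(set(authors))
        for author in unique_authors:
            graph.setdefault(author, set())
        # Every pair of authors on the same paper shares an edge.
        for a, b in combinations(unique_authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Hypothetical publication list; each inner list is one paper's authors.
papers = [["Rao", "Iyer"], ["Iyer", "Das"], ["Sen", "Nair"], ["Gupta"]]
components = connected_components(coauthor_graph(papers))
print(len(components), max(len(c) for c in components))
```

Many small components relative to the number of authors suggest a fragmented network, while one dominant component suggests a close-knit one.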
Clustering in Aggregated User Profiles across Multiple Social Networks IJECEIAES
A social network is an abstraction of related groups interacting amongst themselves to develop relationships. To analyze any relationship and the psychology behind it, clustering plays a vital role. Clustering enhances the predictability and discovery of like-mindedness amongst users. This article uses ensemble K-means clustering to extract entities and their corresponding interests, by skills and location, by aggregating user profiles across multiple online social networks. The proposed ensemble clustering uses the well-known K-means algorithm to improve results for the aggregated user profiles across multiple social networks. The approach produces an ensemble similarity measure and provides 70% better results than taking a fixed value of K or guessing a value of K, without altering the clustering method. This paper states that good ensemble clusters can be generated to anticipate the discoverability of a user for a particular interest.
PREDICTING ELECTION OUTCOME FROM SOCIAL MEDIA DATA kevig
In this era of technology, numerous Online Social Networking Sites (OSNs) have arisen as a medium for
expressing opinions and thoughts about anything, and even for taking a stand on social or
political matters. Nowadays, people connected to those networks tend
to use these online platforms to express their positions on the political
organizations participating in an election throughout the whole election period. The aim of this paper is to
predict the outcome of an election using the tweets posted on Twitter pertaining to the 2019 Australian
federal election, held on May 18, 2019. We combined two effective techniques in order to extract
information from the tweet data and count a virtual vote for each corresponding political group. The
official results of the election, published by the Australian Electoral Commission, closely match the
findings of our investigation.
CATEGORIZING 2019-N-COV TWITTER HASHTAG DATA BY CLUSTERING ijaia
Unsupervised machine learning techniques such as clustering are gaining wide use with the recent growth of social communication platforms like Twitter and Facebook. Clustering enables the finding of patterns in these unstructured datasets. We collected tweets matching hashtags linked to COVID-19 from a Kaggle dataset. We compared the performance of nine clustering algorithms using this dataset. We evaluated the generalizability of these algorithms using a supervised learning model. Finally, using a selected unsupervised learning algorithm, we categorized the clusters. The top five categories are Safety, Crime, Products, Countries and Health. This can prove helpful for bodies that use large amounts of Twitter data and need to quickly find key points in the data before going into further classification.
RUNNING HEADER: Analytics Ecosystem 1
Analytics Ecosystem
Lisa Garay
Rasmussen College
Author's Note
This paper is being submitted for Anastasia Rashtchian’s B288 Business Analytics Course.
This paper looks at the nine clusters of the analytics ecosystem. Clustering refers to grouping similar functions together so as to set them apart from others. The paper begins by listing the clusters before defining them. It then identifies the clusters that represent technology developers and technology users. Peer-reviewed materials are used in this endeavor.
The clusters include the executive sponsor cluster, which contains information that administrators use to direct the system. Another is the end-user tools and dashboards cluster, made up of functions that let people engage with the system. The data owners cluster is made up of programs related to the persons who hold data in the system. The business users' cluster is made up of functions related to clients of the system. The business applications and systems cluster is made up of programs related to the features of a given system. The developers cluster is made up of programs related to the development of programs in the system. The analysts cluster is made up of materials related to the analysis of data in the system. The SME cluster is made up of switches that run SME applications in the system. Lastly, the operational data stores cluster is made up of programs concerned with the storage of data in the system (Pitelis, 2012).
While the developers cluster is made up of technology developers in the system, the business users' cluster is made up of technology users in the system. In conclusion, clustering serves to bring related roles together and to separate roles that are unrelated in a system (Cameron, Gelbach & Miller, 2012).
They can be represented as follows:
References
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2012). Robust inference with multiway clustering. Journal of Business & Economic Statistics.
Pitelis, C. (2012). Clusters, entrepreneurial ecosystem co-creation, and appropriability: a conceptual framework. Industrial and Corporate Change, dts008.
[Diagram labels: Infrastructure; Executive sponsor cluster; End-user tools and dashboards cluster; Operational data stores; Data owners cluster; Business users' cluster; Business systems and applications cluster; Developers cluster; Analysts cluster; SME cluster]
Running head: Sentiment analysis
Sentiment Analysis
Lisa Garay
Rasmussen College
Author's Note
This paper is being submitted for Anastashia Rashtcian’s B288 Business Analytics course.
Sentiment analysis has played a significant role in the contemporary marketing field, specifically in product marketing. According to Somasundaran (2010), the process is structured as a data mining sequence in which the end users of a given product provide feedback on what they have used.
Graph-based Analysis and Opinion Mining in Social Network Khan Mostafa
This is the final report for the Networks & Data Mining Techniques project, which focuses on mining social networks to estimate public opinion about entities and their associated keywords. The project mines Twitter for recent feeds and analyzes them to estimate, for each tweet, a sentiment score, the entity discussed, and the keywords describing it. This data is then used to derive the overall sentiment associated with each entity. The extracted entities and keywords are also used to form an entity-keyword bigraph. This graph is further used to detect entity communities and the keywords found within those communities. The presented implementation works in linear time.
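The aggregation step described in the report above can be sketched as follows, assuming each tweet has already been reduced to an (entity, keywords, sentiment score) record. The record layout, the function name `build_bigraph`, and the sample data are hypothetical and are not taken from the report. The sketch makes a single pass over the records, which is one way to stay linear in the input size.

```python
from collections import defaultdict


def build_bigraph(records):
    """Aggregate per-tweet records into a weighted bigraph and entity sentiment.

    records: iterable of (entity, keywords, score) tuples, one per tweet.
    Returns (edges, sentiment), where edges maps (entity, keyword) to a
    co-occurrence count and sentiment maps entity to its mean score.
    """
    edges = defaultdict(int)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for entity, keywords, score in records:
        totals[entity] += score
        counts[entity] += 1
        for keyword in keywords:
            # One bigraph edge per entity-keyword pair, weighted by frequency.
            edges[(entity, keyword)] += 1
    sentiment = {entity: totals[entity] / counts[entity] for entity in totals}
    return dict(edges), sentiment


# Hypothetical pre-processed tweets.
records = [
    ("iPhone", ["battery", "camera"], 0.5),
    ("iPhone", ["battery"], -1.0),
    ("Pixel", ["camera"], 1.0),
]
edges, sentiment = build_bigraph(records)
print(edges[("iPhone", "battery")], sentiment["iPhone"], sentiment["Pixel"])
```

Community detection would then run on the `edges` mapping; that step is left out here because the report's abstract does not say which method it uses.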
Frequent Item set Mining of Big Data for Social MediaIJERA Editor
Big data is a term for massive data sets having large, more varied and complex structure with the difficulties of storing, analyzing and visualizing for further processes or results. Bigdata includes data from email, documents, pictures, audio, video files, and other sources that do not fit into a relational database. This unstructured data brings enormous challenges to Bigdata.The process of research into massive amounts of data to reveal hidden patterns and secret correlations named as big data analytics. Therefore, big data implementations need to be analyzed and executed as accurately as possible. The proposed model structures the unstructured data from social media in a structured form so that data can be queried efficiently by using Hadoop MapReduce framework. The Bigdata mining is essential in order to extract value from massive amount of data. MapReduce is efficient method to deal with Big data than traditional techniques.The proposed Linguistic string matching Knuth-Morris-Pratt algorithm and K-Means clustering algorithm gives proper platform to extract value from massive amount of data and recommendation for user.Linguistic matching techniques such as Knuth–Morris–Pratt string matching algorithm are very useful in giving proper matching output to user query. The K-Means algorithm is one which works on clustering data using vector space model. It can be an appropriate method to produce recommendation for user.
Frequent Item set Mining of Big Data for Social MediaIJERA Editor
Big data is a term for massive data sets having large, more varied and complex structure with the difficulties of storing, analyzing and visualizing for further processes or results. Bigdata includes data from email, documents, pictures, audio, video files, and other sources that do not fit into a relational database. This unstructured data brings enormous challenges to Bigdata.The process of research into massive amounts of data to reveal hidden patterns and secret correlations named as big data analytics. Therefore, big data implementations need to be analyzed and executed as accurately as possible. The proposed model structures the unstructured data from social media in a structured form so that data can be queried efficiently by using Hadoop MapReduce framework. The Bigdata mining is essential in order to extract value from massive amount of data. MapReduce is efficient method to deal with Big data than traditional techniques.The proposed Linguistic string matching Knuth-Morris-Pratt algorithm and K-Means clustering algorithm gives proper platform to extract value from massive amount of data and recommendation for user.Linguistic matching techniques such as Knuth–Morris–Pratt string matching algorithm are very useful in giving proper matching output to user query. The K-Means algorithm is one which works on clustering data using vector space model. It can be an appropriate method to produce recommendation for user
Profile Analysis of Users in Data Analytics DomainDrjabez
Data Analytics and Data Science is in the fast forward
mode recently. We see a lot of companies hiring people for data
analysis and data science, especially in India. Also, many
recruiting firms use stackoverflow to fish their potential
candidates. The industry has also started to recruit people based
on the shapes of expertise. Expertise of a personal is
metaphorically outlined by shapes of letters like I, T, M and
hyphen betting on her experiencein a section (depth) and
therefore the variety of areas of interest (width).This proposal
builds upon the work of mining shapes of user expertise in a
typical online social Question and Answer (Q&A) community
where expert users often answer questions posed by other
users.We have dealt with the temporal analysis of the expertise
among the Q&A community users in terms how the user/ expert
have evolved over time.
Keywords— Shapes of expertise, Graph communities, Expertise
evolution, Q&A community
Data-to-text technologies present an enormous and exciting opportunity to help
audiences understand some of the insights present in today’s vasts and growing amounts of electronic
data. In this article we analyze the potential value and benefits of these solutions as well as their risks
and limitations for a wider penetration. These technologies already bring substantial advantages of
cost, time, accuracy and clarity versus other traditional approaches or format. On the other hand,
there are still important limitations that restrict the broad applicability of these solutions, most
importantly in the limited quality of their output. However we find that the current state of
development is sufficient for the application of these solution across many domains and use cases and
recommend businesses of all sectors to consider how to deploy them to enhance the value they are
currently getting from their data. As the availability of data keeps growing exponentially and natural
language generation technology keeps improving, we expect data-to-text solutions to take a much
more bigger role in the production of automated content across many different domains.
Over recent years, big data, a huge amount of structured and unstructured data is generated from social Network. There needs to extract the valulable information from the social big data. The traditional analytic platform needs to be scaled up for analyzing social big data in an efficient and timely manner. Sentiment Analysis of social big data helps the organizations by providing business insights with public opinion. Sentiment analysis based on multi-class classification scheme is oriented towards classification of text into more detailed sentiment labels. Multi-class classification with single tier architecture where single model is developed and entire labeled data is trained may increase the classification complexity. In this paper, multi-tier sentiment analysis system on big data analytics platform (MSABDP) is proposed to reduce the multi class classification complexity and efficiently analyze large scale data set. Hadoop is built for big data analytics and it is a good platform for being able to manage large data at scale and which can improve scalability and efficiency by adopting distributed processing environment since they have been implemented using a MapReduce framework and a Hadoop distributed storage (HDFS). The MSABDP is implemented by combining SentiStrength lexicon and learning based classification scheme with multi-tier architecture and run on big data analytics platform for being able to manage large data at scale. The proposed system collects a large amount of real Twitter data by using Apache Flume and the data was used for evaluation. The evaluation results have shown that the proposed multi class classification system with multi-tier architecture is able to significantly improve the classification accuracy over multi class classification based on single-tier architecture by 7%.
Recommender systems: a novel approach based on singular value decompositionIJECEIAES
Due to modern information and communication technologies (ICT), it is increasingly easier to exchange data and have new services available through the internet. However, the amount of data and services available increases the difficulty of finding what one needs. In this context, recommender systems represent the most promising solutions to overcome the problem of the so-called information overload, analyzing users' needs and preferences. Recommender systems (RS) are applied in different sectors with the same goal: to help people make choices based on an analysis of their behavior or users' similar characteristics or interests. This work presents a different approach for predicting ratings within the model-based collaborative filtering, which exploits singular value factorization. In particular, rating forecasts were generated through the characteristics related to users and items without the support of available ratings. The proposed method is evaluated through the MovieLens100K dataset performing an accuracy of 0.766 and 0.951 in terms of mean absolute error and root-mean-square error.
A simplified classification computational model of opinion mining using deep ...IJECEIAES
Opinion and attempts to develop an automated system to determine people's viewpoints towards various units such as events, topics, products, services, organizations, individuals, and issues. Opinion analysis from the natural text can be regarded as a text and sequence classification problem which poses high feature space due to the involvement of dynamic information that needs to be addressed precisely. This paper introduces effective modelling of human opinion analysis from social media data subjected to complex and dynamic content. Firstly, a customized preprocessing operation based on natural language processing mechanisms as an effective data treatment process towards building quality-aware input data. On the other hand, a suitable deep learning technique, bidirectional long short term-memory (Bi-LSTM), is implemented for the opinion classification, followed by a data modelling process where truncating and padding is performed manually to achieve better data generalization in the training phase. The design and development of the model are carried on the MATLAB tool. The performance analysis has shown that the proposed system offers a significant advantage in terms of classification accuracy and less training time due to a reduction in the feature space by the data treatment operation.
Semantic domain ontologies are increasingly seen as the key for enabling
interoperability across heterogeneous systems and sensor-based applications.
The ontologies deployed in these systems and applications are developed by
restricted groups of domain experts and not by semantic web experts. Lately,
folksonomies are increasingly exploited in developing ontologies. The
“collective intelligence”, which emerge from collaborative tagging can be
seen as an alternative for the current effort at semantic web ontologies.
However, the uncontrolled nature of social tagging systems leads to many
kinds of noisy annotations, such as misspellings, imprecision and ambiguity.
Thus, the construction of formal ontologies from social tagging data remains
a real challenge. Most of researches have focused on how to discover
relatedness between tags rather than producing ontologies, much less domain
ontologies. This paper proposed an algorithm that utilises tags in social
tagging systems to automatically generate up-to-date specific-domain
ontologies. The evaluation of the algorithm, using a dataset extracted from
BibSonomy, demonstrated that the algorithm could effectively learn a
domain terminology, and identify more meaningful semantic information for
the domain terminology. Furthermore, the proposed algorithm introduced a
simple and effective method for disambiguating tags.
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
International Journal of Web & Semantic Technology (IJWesT) Vol.9, No.3, July 2018
DOI : 10.5121/ijwest.2018.9301
Rushdi Hamamerh¹ and Sameh Awad²
¹ Department of Computer Engineering, Al-Quds University, Abu Dies, Palestine
² Department of Information Technology, Birzeit University, Ramallah, Palestine
ABSTRACT
Social networks have become one of the most popular platforms for users to communicate and share
their interests without being at the same geographical location. The rapid growth of social media
sites such as Facebook, LinkedIn, and Twitter produces a huge amount of user-generated content.
Improving information quality and integrity has therefore become a great challenge for all social
media sites: users should be able to get the desired content, or be linked to the best link relation,
through improved search and link techniques. Introducing semantics to social networks widens the
representation of those networks.
In this paper, a new model of social networks based on semantic tag ranking is introduced. The model is
based on the concept of multi-agent systems. In the proposed model, the representation of social links is
extended by the semantic relationships found in the vocabularies known as tags in most social
networks. The proposed model for the social media engine is based on enhanced Latent Dirichlet
Allocation (E-LDA) as a semantic indexing algorithm, combined with Tag Rank as a social network ranking
algorithm. The E-LDA phase is improved by running the LDA algorithm with optimal parameters; a filter
is then introduced to enhance the final indexing output. In the ranking phase, applying Tag Rank to the
output of the indexing phase improves the ranking output. Simulation results of the proposed model
show improvements in both indexing and ranking output.
KEYWORDS
Social Network, Multi-Agent Systems, Semantic Indexing, Tag Rank, LDA, E-LDA.
1. INTRODUCTION
Social networks are an emerging field in information interchange, used and in demand worldwide.
Research in the social media field is challenging, as social media has affected and still affects
every aspect of our lives [1].
Ellison and Boyd defined social networks (SN) as web-based services that allow users to build a
public or semi-public profile within a system, connect to a list of other users with whom they share a
connection, and view and extend their list of connections and those made by others within the
system. The nature of these connections may vary from one SN site to another [2].
In current social networks, links between contents are constructed by many ranking techniques,
according to how the data is handled and to its importance and priority. Examples include posts in
Facebook, hashtags in Twitter, and jobs and experiences in LinkedIn. Data must therefore be ranked
in such a way that the links constructing the social graph reflect the natural distribution and
connection between nodes of the social network. The rank of each node is obtained by an iterative
process over the weights in the network. In Semantic Social Networks, this weight can be given
according to the semantic content of the social network node.
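The iterative weighting described above can be illustrated with a small sketch. This is a generic PageRank-style iteration, not the paper's Tag Rank algorithm; the function name `iterative_rank`, the damping factor, and the per-node semantic weights are all assumptions introduced for illustration.

```python
def iterative_rank(edges, weights, damping=0.85, iterations=50):
    """PageRank-style iteration biased by per-node semantic weights.

    edges:   dict mapping every node to a list of nodes it links to
             (all link targets must also appear as keys).
    weights: dict mapping every node to a positive semantic weight
             (hypothetical; e.g. derived from the node's tags).
    """
    nodes = list(edges)
    total = sum(weights[n] for n in nodes)
    # Normalised semantic weights act as the baseline distribution.
    prior = {n: weights[n] / total for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        # Every node keeps a baseline share proportional to its weight.
        new_rank = {n: (1.0 - damping) * prior[n] for n in nodes}
        for src in nodes:
            targets = edges[src]
            if not targets:
                # A node without outgoing links spreads its rank by the prior.
                for n in nodes:
                    new_rank[n] += damping * rank[src] * prior[n]
                continue
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


# Example: "c" is linked from both "a" and "b", so it ends up ranked highest.
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = iterative_rank(graph, {"a": 1.0, "b": 1.0, "c": 1.0})
```

Raising a node's semantic weight raises its baseline share, which is one simple way the semantic content of a node could influence its rank.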
The semantic content of a social network, a large and complex collection of data of the kind
known nowadays as “Big Data” [3], must be indexed before the ranking process. This can be achieved
by introducing semantic indexing algorithms to process the content of social networks [4]. Improving
the indexing output and choosing the proper ranking algorithm affect the quality of the social graph
and how nodes are linked in the social network.
2. MULTI-AGENT SYSTEMS
For the indexing and ranking processes, the concept of a Multi-Agent System (MAS) is a valuable
addition that provides a good, improving, and self-learning mechanism, especially in social networks.
Multi-agent systems are computerized systems consisting of multiple agents that interact
intelligently within an environment and can be used to solve problems [5].
The agent perceives data and grabs documents from the environment (SSN) and, using its
learning elements, updates its performance elements. It builds the knowledge base by
updating the learning elements based on critics, which represent the feedback from the whole
system in which the agent is working. This knowledge base generates problems that can be used
as condition rules in the decisions the agent makes to carry out the needed action in
the environment.
In social networks, multi-agent implementation theories have two main perspectives: the user
perspective and the network perspective.
In the user perspective, the agents are the user accounts [6]: each account acts as an agent that
mediates data and negotiates connections with the other agents to enlarge its social network.
In the network perspective, by contrast, the agents carry out central operations such as
filtering data, managing connectivity, and building the social graph.
Because this paper deals with semantics in social networks, handling semantics must be an
important role of the agents in the multi-agent implementation of the social network.
This paper concentrates on the following agent roles: parsing data, building a semantic index of
the data, ranking this index, and finally building connections between content items according to
the rank output.
3. SEMANTIC INDEXING - LATENT DIRICHLET ALLOCATION (LDA)
Indexing algorithms, mainly in search engines, collect, parse, analyse, and store data to
facilitate quick and accurate information retrieval [7]. Index design draws on interdisciplinary
concepts from linguistics, cognitive psychology, mathematics, computer science, and informatics.
In the context of search engines designed to search web pages on the Internet, the process is
also called web indexing.
In information retrieval, stored documents are identified by sets of terms that represent the
contents of each document. Indexing is the process of assigning these index terms to the
documents in the collection. The index terms can be predefined as a fixed controlled vocabulary,
or they can be any additional words that the indexer considers related to the topic of the
document.
One of the most popular indexing algorithms is Latent Dirichlet Allocation (LDA) [8], a
generative probabilistic model for collections of discrete data such as text corpora. LDA is a
three-level hierarchical Bayesian model, in which each item of a collection is modelled as a finite
mixture over an underlying set of topics. Each topic is, in turn, modelled as an infinite mixture
over an underlying set of topic probabilities. LDA assumes that each document contains different
topics, and that words in the document are generated from these topics. All documents contain a
specific set of topics, but the proportion of each topic in each document is different. The
generative process of the LDA model can be described as follows [9]. For each document w in a
corpus D:
1- Choose N ~ Poisson($\xi$).
2- Choose $\theta$ ~ Dir($\alpha$).
3- For each of the N words $w_n$:
   a- Choose a topic $z_n$ ~ Multinomial($\theta$).
   b- Choose a word $w_n$ from $p(w_n \mid z_n, \beta)$, a multinomial probability conditioned on the topic $z_n$.
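The generative steps above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation used in the simulation: the helper names, the toy values of alpha, beta, and xi, and the Knuth-style Poisson sampler are all hypothetical choices made for this example.

```python
import math
import random

def sample_poisson(rng, lam):
    """Draw from Poisson(lam) with Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def sample_dirichlet(rng, alpha):
    """Draw theta ~ Dir(alpha) by normalising independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document(rng, alpha, beta, xi):
    """Steps 1-3: N ~ Poisson(xi), theta ~ Dir(alpha), then one (z_n, w_n) pair per word."""
    n_words = sample_poisson(rng, xi)
    theta = sample_dirichlet(rng, alpha)
    topics = range(len(alpha))
    vocab = range(len(beta[0]))
    doc = []
    for _ in range(n_words):
        z = rng.choices(topics, weights=theta)[0]    # z_n ~ Multinomial(theta)
        w = rng.choices(vocab, weights=beta[z])[0]   # w_n ~ p(w | z_n, beta)
        doc.append((z, w))
    return theta, doc

# Hypothetical toy setting: k = 2 topics over a 4-word vocabulary.
rng = random.Random(0)
alpha = [0.5, 0.5]
beta = [[0.7, 0.2, 0.1, 0.0],
        [0.0, 0.1, 0.2, 0.7]]
theta, doc = generate_document(rng, alpha, beta, xi=8)
```

Each row of beta is one topic's word distribution, so a word with probability zero under the sampled topic is never emitted.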
Several simplifying assumptions are made in this basic model, some of which can be relaxed.
First, the dimensionality k of the Dirichlet distribution, and thus the dimensionality of the
topic variable z, is assumed known and fixed. Second, the word probabilities are parameterized by
a $k \times V$ matrix $\beta$, where $\beta_{ij} = p(w^j = 1 \mid z^i = 1)$, which for now is treated as a fixed
quantity that is to be estimated. Finally, the Poisson assumption is not critical to anything that
follows, and more realistic document length distributions can be used as needed.
Furthermore, note that N is independent of all the other data-generating variables ($\theta$ and z). It is
thus an ancillary variable, and its randomness will generally be ignored in the subsequent
development. A k-dimensional Dirichlet random variable $\theta$ can take values in the (k−1)-simplex
(a k-vector $\theta$ lies in the (k−1)-simplex if $\theta_i \ge 0$ and $\sum_{i=1}^{k} \theta_i = 1$), and has the following
probability density on this simplex:
$$p(\theta \mid \alpha) = \frac{\Gamma\left(\sum_{i=1}^{k} \alpha_i\right)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \, \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1} \tag{1}$$
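where the parameter $\alpha$ is a k-vector with components $\alpha_i > 0$, and where $\Gamma(x)$ is the Gamma
function. The Dirichlet is a convenient distribution on the simplex: it is in the exponential family,
has finite-dimensional sufficient statistics, and is conjugate to the multinomial distribution. Given
the parameters $\alpha$ and $\beta$, the joint distribution of a topic mixture $\theta$, a set of N topics z, and a set of
N words w is given by:
$$p(\theta, z, w \mid \alpha, \beta) = p(\theta \mid \alpha) \prod_{n=1}^{N} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \tag{2}$$
where $p(z_n \mid \theta)$ is simply $\theta_i$ for the unique i such that $z_n^i = 1$. Integrating over $\theta$ and summing
over z gives the marginal distribution of a document:
$$p(w \mid \alpha, \beta) = \int p(\theta \mid \alpha) \left( \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta) \, p(w_n \mid z_n, \beta) \right) d\theta \tag{3}$$
Finally, the probability of generating the corpus, whose logarithm is the log-likelihood, is the
product of the marginal probabilities of its M documents:
$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{n=1}^{N_d} \sum_{z_{dn}} p(z_{dn} \mid \theta_d) \, p(w_{dn} \mid z_{dn}, \beta) \right) d\theta_d \tag{4}$$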
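As a rough numerical check of the Dirichlet density (1) and the joint distribution (2), the following Python sketch evaluates both in log space. The function names and the toy values are assumptions introduced here for illustration; topics and words are represented as integer indices rather than the one-hot vectors used in the formulas.

```python
import math

def log_dirichlet_pdf(theta, alpha):
    """Log of the Dirichlet density: log Gamma(sum alpha) - sum log Gamma(alpha_i)
    + sum (alpha_i - 1) * log(theta_i)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(t) for a, t in zip(alpha, theta))

def log_joint(theta, topics, words, alpha, beta):
    """Log of p(theta, z, w | alpha, beta) for integer-coded topics z_n and words w_n."""
    total = log_dirichlet_pdf(theta, alpha)
    for z, w in zip(topics, words):
        total += math.log(theta[z])     # p(z_n | theta) is theta_i for the chosen topic i
        total += math.log(beta[z][w])   # p(w_n | z_n, beta) is row z_n, column w_n of beta
    return total

# Hypothetical toy values: k = 2 topics, V = 2 words, a two-word document.
alpha = [1.0, 1.0]
theta = [0.5, 0.5]
beta = [[0.5, 0.5],
        [0.25, 0.75]]
value = log_joint(theta, topics=[0, 1], words=[0, 1], alpha=alpha, beta=beta)
```

With alpha = (1, 1) the Dirichlet density is uniform on the simplex, so its log is zero and the joint reduces to the product 0.5 × 0.5 × 0.5 × 0.75.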
4. PROPOSED MODEL
The proposed model for semantic tag ranking in social networks is based on enhanced LDA. The
input to this model is the document collection, which contains the word-per-document counts.
The final output is the ranking of the tags, that is, the Tag Rank results for the topic
index. The proposed model has two main phases: the indexing phase, which is carried out
by the indexing agent, and the ranking phase, which is carried out by the ranking agent. In the
indexing phase, the input is the document collection, which contains the word and document counts.
In this phase, initialization is performed and the documents are parsed to get the initial index to be
processed by the LDA algorithm. The output of this phase is the semantic index, which contains
the word-per-topic distribution and the topic-per-document distribution.
In the proposed model, the focus is on the topic-per-document distribution, whose topics are
processed as tags. Therefore, in the next phase, the ranking phase, the input is the
topic-per-document distribution delivered as the index matrix. In the ranking phase, this input is
processed by the Tag Rank algorithm, run by the ranking agent. The final output is the tag ranking
matrix, which is used to build the social links in the semantic social network. Figure 1 shows the
proposed model architecture.
Figure 1. System Architecture.
The indexing phase has seven sequential steps that build the topic-per-document index from the
document collection to be processed. Figure 2 shows the steps of this phase:
Figure 2. Flowchart of Indexing Phase.
The phase starts with the input document collection, which is parsed and then indexed using the
optimal parameters (α, β, and k) that increase the precision and recall of the output.
The output is the probability of each topic per document, which is then filtered by a
threshold (τ) chosen experimentally in the simulation. The final output is the
filtered (θ), which is the output of the enhanced LDA algorithm, called E-LDA.
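The filtering step can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function name, the row-per-document layout of θ, and the toy probabilities are hypothetical, and the strict comparison follows the "greater than τ" condition of the filter rule in Algorithm 1.

```python
def filter_theta(theta, tau):
    """Keep, for each document, only the topics whose probability exceeds tau.

    theta is a list of rows, one per document, holding the topic-per-document
    probabilities; the result maps each surviving topic index to its probability.
    """
    return [{topic: p for topic, p in enumerate(row) if p > tau} for row in theta]

# Hypothetical topic-per-document matrix for three documents and k = 4 topics.
theta = [[0.70, 0.20, 0.05, 0.05],
         [0.10, 0.55, 0.30, 0.05],
         [0.25, 0.25, 0.25, 0.25]]
filtered = filter_theta(theta, tau=0.5)
```

In this toy input the third document has no dominant topic, so it keeps no tag after filtering; a lower τ would retain more tags at the cost of weaker ones.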
The next phase is the ranking phase. It starts from the output of the E-LDA algorithm, after
checking that (θ) is higher than the threshold (τ). The Tag Rank algorithm then ranks (θ) as the
initial tag rank. The role of the ranking algorithm here is simply to maximize the rank. Each document will get the
highest-ranked topic as its first tag. In addition, documents are ranked in descending order for
each tag, i.e. for each topic. Figure 3 shows the steps of the ranking phase.
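The two orderings described here, tags within each document and documents within each tag, can be sketched as follows. This is an illustrative sketch rather than the Tag Rank implementation itself: the function names and the toy filtered index (one dictionary of topic probabilities per document) are assumptions.

```python
def rank_tags(filtered):
    """For each document, order its surviving topics by descending probability."""
    return [sorted(row, key=row.get, reverse=True) for row in filtered]

def rank_documents(filtered):
    """For each tag (topic), order the documents carrying it by descending probability."""
    by_tag = {}
    for doc_id, row in enumerate(filtered):
        for topic, p in row.items():
            by_tag.setdefault(topic, []).append((p, doc_id))
    return {topic: [doc for _p, doc in sorted(pairs, key=lambda pair: -pair[0])]
            for topic, pairs in by_tag.items()}

# Hypothetical filtered index: document id -> {topic: probability}.
filtered = [{0: 0.6, 1: 0.3},
            {1: 0.8},
            {0: 0.9, 1: 0.1}]
tags_per_document = rank_tags(filtered)
documents_per_tag = rank_documents(filtered)
```

The first tag of each document is its highest-probability topic, and the per-tag document lists give the order in which links between documents sharing a tag could be built.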
Figure 3. Flowchart of Ranking Phase.
As these flowcharts show, two main intelligent agents carry out the system functions: the
indexing agent and the ranking agent. The following pseudocode shows the steps of the semantic
tag ranking algorithm.
Algorithm 1. Intelligent Semantic Tag Ranking
Input: Document Collection
Start
// Indexing Agent {
  Rule 1: Get Document
  Rule 2: Parse Document Content
  for i = 1 to n do   // n = number of document records
    Rule 3: Start LDA Indexing Algorithm
  end for
  Rule 4: Filter
  {
    for i = 1 to n do   // n = number of document records
      Select θ_{t_i} where θ_{t_i} > τ   // τ is the threshold
    end for
  }
  Output: Index (θ_{t_1}, θ_{t_2}, …, θ_{t_n})
end }   // end of Indexing Agent job
Input: Index (θ_{t_1}, θ_{t_2}, …, θ_{t_n})
// Ranking Agent {
Start
  for i = 1 to n do   // n = number of tags
    // repeat until all tags that have ranks larger than the threshold τ are processed
    Repeat {
      // select document 1 and document 2 to be compared and maximized
      Select Max(θ_i, θ_{i+1})
      Condition: While (Max(θ_i, θ_{i+1}) ≥ τ) {   // τ is the threshold
        Select Max(θ_i, θ_{i+1})
        Sort (θ_i, θ_{i+1})
      }
      i = i + 1
    }   // until all tags that are larger than τ are processed
  for j = 1 to k do   // k = number of documents
    // repeat until all documents that have ranks larger than the threshold τ are processed
    Repeat {
      // select tag 1 and tag 2, which are the columns and rows of Max(θ_{t_j}, θ_{t_{j+1}})
      Select Max(θ_{t_j}, θ_{t_{j+1}})
      Condition: While (Max(θ_{t_j}, θ_{t_{j+1}}) ≥ τ) {   // τ is the threshold
        Select Max(θ_{w_j}, θ_{w_{j+1}})
        Sort (θ_{w_j}, θ_{w_{j+1}})
      }
      j = j + 1
    }   // until all tags that are larger than τ are processed
  Build Links between Tags
  Output: Tag Rank records
end }   // end of Ranking Agent job
5. SIMULATION RESULTS
This section presents the simulation experiment for the proposed model. The idea of using the
thresholded LDA index as the tag input of the ranking algorithm has to be supported by results,
including a comparison of the proposed model against alternatives in both the indexing and the
ranking phases.
The simulation was carried out using MATLAB R2016a under the Microsoft Windows 10 operating
system.
The hardware platform running the software was an Intel Core i7-3520M processor with 8
gigabytes of random access memory.
The simulation of the indexing phase builds on previous simulation work by the Natural Language
Processing Group at Stanford University [10], the natural language labs at Iowa State
University [11], and the research toolbox from the University of California, Irvine [12], using
their MATLAB functions to implement the enhanced LDA function.
The dataset used was the psychreview dataset, which contains Psychology Review abstracts and
collocation data. It contains about 85000 records of words and documents, with the initial count
of words for each document and topic.
To evaluate the simulation, four main metrics were used. Two evaluate indexing: precision and
recall. The other two evaluate ranking: Mean Average Precision (MAP) and Normalized Discounted
Cumulative Gain (NDCG) [13].
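Precision: the ratio of the number of relevant documents retrieved to the total number of
documents retrieved, relevant and irrelevant:
$$\text{Precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|} \tag{5}$$
Recall: the ratio of the number of relevant documents retrieved to the total number of relevant
documents in the dataset:
$$\text{Recall} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{relevant documents}\}|} \tag{6}$$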
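Both indexing metrics follow directly from the set definitions above. The following Python sketch shows one way to compute them; the function names and the toy document identifiers are assumptions for illustration, and returning 0.0 for an empty denominator is a convention chosen here.

```python
def precision(relevant, retrieved):
    """Share of the retrieved documents that are relevant."""
    relevant, retrieved = set(relevant), set(retrieved)
    return len(relevant & retrieved) / len(retrieved) if retrieved else 0.0

def recall(relevant, retrieved):
    """Share of the relevant documents that were retrieved."""
    relevant, retrieved = set(relevant), set(retrieved)
    return len(relevant & retrieved) / len(relevant) if relevant else 0.0

# Hypothetical document ids: four relevant documents, three retrieved, two in common.
relevant_docs = {1, 2, 3, 4}
retrieved_docs = {2, 3, 5}
p = precision(relevant_docs, retrieved_docs)
r = recall(relevant_docs, retrieved_docs)
```

Here two of the three retrieved documents are relevant (precision 2/3), and two of the four relevant documents were found (recall 1/2).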
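Mean Average Precision (MAP): the mean, over all queries, of the average precision (AveP), where
AveP is the precision-at-k score of a ranking averaged over the positions k of the relevant
documents:
$$MAP = \frac{\sum_{q=1}^{Q} AveP(q)}{Q} \tag{7}$$
where:
$$AveP = \text{Average Precision} = \frac{\sum_{k=1}^{n} \left( P(k) \times rel(k) \right)}{\text{number of relevant documents}} \tag{8}$$
Q is the number of queries, n is the number of retrieved documents, P(k) is the precision at
cut-off k, and:
$$rel(k) = \begin{cases} 1, & \text{when the item at rank } k \text{ is relevant} \\ 0, & \text{otherwise} \end{cases} \tag{9}$$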
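A minimal Python sketch of average precision and MAP is given below. The function names, the representation of a query as a (ranking, relevant set) pair, and the toy rankings are assumptions made for this illustration.

```python
def average_precision(ranking, relevant):
    """Sum precision-at-k over the ranks k that hold a relevant item,
    divided by the number of relevant documents."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for k, item in enumerate(ranking, start=1):
        if item in relevant:      # rel(k) = 1
            hits += 1
            total += hits / k     # P(k): relevant items among the top k
    return total / len(relevant)

def mean_average_precision(queries):
    """Average of the per-query average precision over all queries."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)

# Hypothetical queries: (ranked result list, set of relevant items).
queries = [(["a", "b", "c", "d"], {"a", "c"}),
           (["x", "y"], {"y"})]
map_score = mean_average_precision(queries)
```

For the first query the relevant items sit at ranks 1 and 3, giving (1 + 2/3) / 2 = 5/6; the second gives 1/2, so the mean over both queries is 2/3.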
Normalized Discounted Cumulative Gain (NDCG): a normalization of the Discounted Cumulative
Gain (DCG), where DCG is a weighted sum of the relevance degrees of the ranked items:
$$NDCG_p = \frac{DCG_p}{IDCG_p} \tag{10}$$
where:
$$DCG_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i} \tag{11}$$
and $IDCG_p$ is the ideal DCG at position p.
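The ranking metric can be sketched in Python as follows. This sketch assumes the DCG form in which the first item is undiscounted and item i (for i ≥ 2) is divided by log2(i), and it assumes that the ideal ordering is obtained by sorting the same graded relevance values in descending order; the function names and toy grades are hypothetical.

```python
import math

def dcg(relevances):
    """DCG at p = len(relevances): rel_1 plus rel_i / log2(i) for i = 2..p."""
    if not relevances:
        return 0.0
    tail = sum(rel / math.log2(i) for i, rel in enumerate(relevances[1:], start=2))
    return relevances[0] + tail

def ndcg(relevances):
    """DCG divided by the ideal DCG of the same relevance values."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical graded relevance of the top four ranked items (p = 4).
grades = [3, 2, 3, 0]
score = ndcg(grades)
```

A ranking that already lists items in descending relevance scores exactly 1; any other order, such as the toy grades above, scores below 1.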
Based on earlier research [14], the optimal parameters chosen for the LDA
algorithm (𝛼, 𝛽, and k) were 𝑘 = 4, 𝛼 =
.
, And 𝛽 = 0.1.
The next enhancement is to choose the best threshold (𝜏) for filtering the output of the indexing
process. Therefore, for the index output obtained with the parameters chosen above,
the precision and recall of the output were calculated with the filter applied. The result is
shown in Figure 4.
Figure 4. Precision and Recall according to (𝜏)
The best combination of precision and recall was observed around 𝜏=0.5, so this value is a
reasonable choice for the filter threshold. The resulting algorithm with these enhancements is
called “Enhanced LDA”, abbreviated E-LDA. Figure 5 shows how the topic distribution is enhanced
using this filter:
Figure 5. Topic Distribution in Document Collection according to the Filter.
The simulation for the indexing agent builds on previous research comparing indexing
algorithms [9] [15]. Figure 6 shows a comparison between E-LDA and these algorithms.
Figure 6. (E-LDA) vs semantic indexing algorithms
As shown in Figure 6, E-LDA improves on LDA by 4%. E-LDA has a better combination of precision
and recall, which means better relevancy in the index output.
Once the indexing phase has been enhanced, Tag Rank can be combined with its output to obtain
the semantic tag rank. Figure 7 shows the improvement in precision and recall between E-LDA
and Tag Rank.
Figure 7. Tag Rank vs. (E-LDA).
As shown in Figure 7, Tag Rank achieves better precision and recall than its E-LDA input, by
almost 5%.
Tag Rank was then compared with PageRank (PR), Weighted PageRank (WPR), Hyperlink-Induced
Topic Search (HITS), and Time Rank (TSPR) [16] [17] [18] [19], according to MAP and NDCG@k with
k=4, since k=4 is the best parameter for the LDA indexing algorithm, as concluded
earlier [14]. Figure 8 shows the comparison between the ranking algorithms:
Figure 8. Semantic Tag Rank vs. Ranking Algorithms
As shown in Figure 8, Tag Rank achieves the best MAP and NDCG values, which suggests that
Tag Rank is the most suitable ranking algorithm for the proposed model.
6. CONCLUSION AND FUTURE WORK
The main aim of this paper is to provide a new model of social networks based on the
multi-agent systems concept and the concept of the semantic social network. The proposed model
consists of two main agents: an indexing agent that carries out the enhanced Latent Dirichlet
Allocation algorithm (E-LDA), and a ranking agent that carries out the Tag Rank algorithm.
E-LDA is distinguished from preceding indexing algorithms, and the simulation results show an
increase in precision and recall when using it. E-LDA improves on LDA by 4% and shows better
performance than other semantic indexing algorithms.
Semantic Tag Rank is also distinguished from other ranking algorithms because it deals with tags,
which are more relevant to social networks and to semantics.
In the future, the term-per-topic index could also be fed as tags to the ranking agent. This
means that more data will have to be ranked, so the processing requirements must be taken into
account when implementing the system.
A new model of social networks based on semantics has been proposed, using semantic indexing
methods and ranking algorithms, and the simulation tests showed how this idea can be
implemented. As a next step, the proposed model could be built and deployed as a semantic social
network, either within an existing social network or in a new semantic social network
programmed from the beginning on the basis of the model proposed in this paper.
REFERENCES
[1] Obar, Jonathan A., Wildman, Steve. Social media definition and the governance challenge: An
introduction to the special issue. Telecommunications policy. 39 (9): 745–750.
doi:10.1016/j.telpol.2015.
[2] Boyd, Dana. Ellison, Nicole. Social Network Sites: Definition, History, and Scholarship, Michigan
State University, (2007).
[3] Boyd, Dana; Crawford, Kate. Six Provocations for Big Data. Social Science Research Network: A
Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. (September 21,
2011). doi:10.2139/ssrn.1926431.
[4] Stephen Downes. The Semantic Social Network. February 14, 2004.
[5] Wooldridge, Michael. An Introduction to MultiAgent Systems. John Wiley & Sons. (2002) p. 366.
ISBN 0-471-49691-X.
[6] Franchi, Enrico, “A Multi-Agent Implementation of Social Networks”, Proceedings of the 11th
WOA 2010 Workshop, Dagli Oggetti Agli Agenti, Rimini, Italy, September 5-7, 2010.
[7] Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, “Introduction to Information
Retrieval”. Cambridge University Press. 2008.
[8] Blei, David M.; Andrew Y. Ng; Michael I. Jordan. “Latent Dirichlet Allocation”. Journal of Machine
Learning Research. 3: 993–1022. doi:10.1162/jmlr.2003.3.4-5.993. 2003.
[9] Wang, Y., Lee, J.-S. and Choi, I.-C. “Indexing by Latent Dirichlet Allocation and an Ensemble
Model”. Journal of the Association for Information Science and Technology, 67: 1736–1750.
doi:10.1002/asi.23444. 2016.
[10] The Natural Language Processing Group at Stanford University, https://nlp.stanford.edu/, accessed in
October 1, 2017.
[11] Iowa State University, http://home.eng.iastate.edu, accessed in October 10, 2017.
[12] Matlab Topic Modeling Research Toolbox in University of California, Irvine,
http://psiexp.ss.uci.edu/research/programs_data/toolbox.htm, accessed in October 12, 2017.
[13] Järvelin, Kalervo and Jaana Kekäläinen. “IR evaluation methods for retrieving highly relevant
documents.” SIGIR Forum 51 (2000): 243-250.
[14] R. Hamamreh and S. Awad, "Tag Ranking Multi-Agent Semantic Social Networks ," 2017
International Conference on Computational Science and Computational Intelligence (CSCI), Las
Vegas, NV, 2017.
[15] Choi, In-Chan & Lee, Jaesung. “Document Indexing by Latent Dirichlet Allocation”. In proceedings
of the 2010 international conference on data mining, At Los Angeles, 2010.
[16] L. Page, S. Brin, R. Motwani, and T. Winograd, “The PageRank Citation Ranking: Bringing Order to
the Web”, Technical Report, Stanford Digital Libraries SIDL-WP-1999-0120, 1999.
[17] Wenpu Xing and Ali Ghorbani, “Weighted PageRank Algorithm”, In proceedings of the 2nd Annual
Conference on Communication Networks & Services Research, pp. 305-314, 2004.
[18] Jon Kleinberg, “Authoritative Sources in a Hyperlinked Environment”, In Proceedings of the ACM-
SIAM Symposium on Discrete Algorithms, 1998.
[19] H Jiang et al., "TIMERANK: A Method of Improving Ranking Scores by Visited Time", In
proceedings of the Seventh International
AUTHORS
Rushdi A. Hamamreh holds a Ph.D. in Distributed Systems and Networks Security. He
graduated from the Saint Petersburg State Technical University in 2002. He is an Associate
Professor and Head of Computer Engineering at Al-Quds University. His research interests
include network security, routing protocols, multi-agent systems, and cloud and mobile
computing.
Email id : rushdi@staff.alquds.edu
Sameh Awad graduated in Computer Engineering from Al-Quds University in 2008. Since
then he has been working in the Department of Information Technology at Birzeit
University. In 2018 he completed an MSc in Electronics and Computer Engineering
at Al-Quds University.
Email id : sfawad@birzeit.edu