This document discusses an efficient filtering system for unwanted messages on social networking sites. It proposes a Trust Evaluation System (TES) that uses a reputation metric to evaluate new messages submitted by users and assigns each one a confidence level based on the trustworthiness of the reporter. TES rewards reporters whose feedback agrees with that of highly trusted users and penalizes those who disagree. It also continuously updates the confidence level of a message as additional feedback arrives. The system aims to build a community of trusted reporters and to automatically filter future messages that match fingerprints already cataloged as spam.
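The reward, penalty, and confidence-update behaviour described above can be illustrated with a minimal sketch. The summary gives no formulas, so the class name `TrustEvaluator`, the trust range of 0 to 1, the default trust of 0.5, the reward and penalty sizes, and the trust-weighted vote used as the confidence level are all assumptions made for illustration, not the document's actual method.

```python
from dataclasses import dataclass, field


@dataclass
class TrustEvaluator:
    """Toy reputation tracker; every name and constant is an illustrative assumption."""
    reward: float = 0.05         # trust gained for agreeing with the trusted consensus
    penalty: float = 0.10        # trust lost for disagreeing with it
    default_trust: float = 0.5   # starting trust for an unknown reporter
    trust: dict = field(default_factory=dict)    # reporter id -> trust in [0, 1]
    reports: dict = field(default_factory=dict)  # fingerprint -> {reporter id: is_spam}

    def confidence(self, fingerprint):
        """Trust-weighted share of reports that call the message spam."""
        votes = self.reports.get(fingerprint, {})
        total = sum(self.trust.get(r, self.default_trust) for r in votes)
        if total == 0:
            return 0.0
        spam = sum(self.trust.get(r, self.default_trust)
                   for r, is_spam in votes.items() if is_spam)
        return spam / total

    def report(self, fingerprint, reporter, is_spam):
        """Record feedback, adjust the reporter's trust, return the new confidence."""
        self.trust.setdefault(reporter, self.default_trust)
        prior = self.reports.setdefault(fingerprint, {})
        if prior:
            consensus = self.confidence(fingerprint)  # consensus before this report
            if consensus != 0.5:                      # no adjustment on an exact tie
                agrees = (consensus > 0.5) == is_spam
                delta = self.reward if agrees else -self.penalty
                self.trust[reporter] = min(1.0, max(0.0, self.trust[reporter] + delta))
        prior[reporter] = is_spam
        return self.confidence(fingerprint)
```

Under these assumptions, a message's confidence is recomputed on every call to `report`, and a fingerprint whose confidence crosses some chosen threshold could then be added to the spam catalog.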
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Pharmaceutical Science Invention (IJPSI) is an international journal intended for professionals and researchers in all fields of Pharmaceutical Science. IJPSI publishes research articles and reviews across the whole field of Pharmacy and Pharmaceutical Science, including new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in our journal can be accessed online.
Filtering Unwanted Messages from Online Social Networks (OSN) using Rule Base... - IOSR Journals
Online Social Networks (OSNs) are today among the most popular interactive media for sharing, communicating, and distributing a significant amount of information about human life. In OSNs, information filtering can also be used for a different, more responsive function, because users can post or comment on other posts in particular public or private areas, generally called walls. Information filtering can therefore give users the ability to automatically control the messages written on their own walls by filtering out unwanted ones. OSNs provide very little support for preventing unwanted messages on user walls. For instance, Facebook lets users state who is allowed to insert messages on their walls (i.e., friends, defined groups of friends, or friends of friends). However, no content-based preferences are supported, so it is not possible to prevent undesired messages, for instance political or offensive ones, regardless of who posts them. The aim is to propose and experimentally evaluate an automated system, called Filtered Wall (FW), able to filter unwanted messages from OSN user walls.
Content Based Message Filtering For OSNS Using Machine Learning Classifier - IJMER
Online social networking (OSN) sites such as Twitter, Orkut, YouTube, and Facebook are among the most popular sites on the Internet. The users of these sites form a social network, which provides a powerful means of sharing, organizing, and finding useful information. Unlike general web information, OSNs are organized around their users: as more users join the network, they share their information and create links to communicate with other online users. The resulting social network provides a basis for maintaining social relationships, for finding users with similar interests, and for locating content and knowledge that has been contributed or endorsed by other users. In OSNs, information filtering can be used to prevent unwanted messages from being shared or commented on user walls. In this paper, we propose a system to filter undesired messages from OSN walls. The system exploits a machine learning soft classifier to enforce customizable content-dependent filtering rules (FRs). Moreover, the flexibility of the proposed system in terms of filtering options is enhanced through the management of blacklists (BLs).
Filtered Wall is a system to filter undesired messages from OSN walls.
The system also decides when a user should be inserted into a blacklist.
Filtered Wall has a wide variety of applications on OSN walls.
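How filtering rules and a blacklist might interact on a wall can be sketched as follows. This is only an illustration under stated assumptions: the keyword lookup `soft_classify` stands in for the machine learning soft classifier, and the class labels, thresholds, the `FilteredWall` class, and the "blacklist after `max_strikes` blocked posts" policy are hypothetical, since the abstract doesn't specify any of them.

```python
from collections import Counter

# Stand-in for the machine learning soft classifier: returns a membership
# score in [0, 1] per class. A keyword lookup is used purely for illustration.
KEYWORDS = {"politics": {"election", "senator"}, "vulgar": {"idiot", "stupid"}}


def soft_classify(text):
    words = set(text.lower().split())
    return {label: (1.0 if words & vocab else 0.0) for label, vocab in KEYWORDS.items()}


class FilteredWall:
    def __init__(self, rules, max_strikes=2):
        self.rules = rules              # filtering rules: class label -> score threshold
        self.max_strikes = max_strikes  # blocked posts tolerated before blacklisting (assumed policy)
        self.strikes = Counter()        # author -> number of blocked posts
        self.blacklist = set()
        self.posts = []

    def submit(self, author, text):
        """Return True if the message is published on the wall."""
        if author in self.blacklist:
            return False
        scores = soft_classify(text)
        if any(scores.get(label, 0.0) >= threshold
               for label, threshold in self.rules.items()):
            self.strikes[author] += 1
            if self.strikes[author] >= self.max_strikes:
                self.blacklist.add(author)
            return False
        self.posts.append((author, text))
        return True
```

For example, a wall owner who only objects to vulgar content could create `FilteredWall({"vulgar": 0.5})`; political posts would then still be published, because no rule names that class.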
Rule based messege filtering and blacklist management for online social network - eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Semantic Massage Addressing based on Social Cloud Actor's Interests - CSCJournals
Wireless communication with mobile terminals has become a popular tool for collecting and sending information and data. With mobile communication comes the Short Message Service (SMS) technology, which is an ideal way to stay connected with anyone, anywhere, anytime, and helps maintain business relationships with customers. Sending individual SMS messages to a long list of mobile numbers can be very time consuming; it also faces the usual problems of wireless communication, such as variable and asymmetric bandwidth, geographical mobility, and high usage costs, as well as the rigidity of lists. This paper proposes a technique that ensures a message is sent to a semantically specified group of recipients. A recipient group is automatically identified based on personal information (interests, workplace, publications, social relationships, etc.) and behavior, using a populated ontology created by integrating publicly available FOAF (Friend-of-a-Friend) documents. We demonstrate that our simple technique can, first, extract groups effectively according to their descriptive attributes and, second, send SMS messages effectively, while helping to combat unintentional spam and preserving the privacy of mobile numbers and even individual identities. The technique provides a fast, effective, and dynamic solution that saves time in constructing lists and sending group messages, and it can be applied at both the personal and the business level.
Filter unwanted messages from walls and blocking nonlegitimate user in osn - IJSRD
Today’s life is largely based on the Internet, and people can no longer imagine life without it. Information and communication technology plays a vital role in today’s online networked society, and online social networks are a close part of everyday life. They are used for posting and sharing information across various social networking sites. However, online social networks do not maintain user privacy and provide little or no support for protecting users’ sensitive information. To filter unwanted messages, we propose a system based on machine learning (ML), in which a soft classifier performs content-based filtering. In the proposed system, filtering rules (FRs) are provided for content independent filtering. Blacklists add flexibility by increasing the available filtering choices. The proposed system provides security to online social networks.
A web service is defined as a software system designed to support interoperable machine-to-machine interaction over a network. Put another way, web services provide a framework for system integration that is independent of programming language and operating system. Web services are widely deployed in current distributed systems and have become the technology of choice. Their suitability for integrating heterogeneous systems is largely due to their extensive use of the Extensible Markup Language (XML). Thus, the security of a web services based system depends not only on the security of the services themselves, but also on the confidentiality and integrity of the XML based SOAP messages used for communication. Recently, web services have generated great interest among both vendors and researchers. A web service is based on existing Internet protocols and open standards and provides a flexible solution to the problem of application integration. This paper provides an overview of web services, web service security, and the various algorithms used for encryption of SOAP messages.
A Mobile Messaging Apps: Service Usage Classification to Internet Traffic and... - dbpublications
The rapid adoption of mobile messaging Apps has enabled us to collect a massive amount of encrypted Internet traffic from mobile messaging. Classifying this traffic into different types of in-App service usage can help with intelligent network management, such as managing the network bandwidth budget and providing quality of service. Traditional approaches to Internet traffic classification rely on packet inspection, such as parsing HTTP headers. However, messaging Apps increasingly use secure protocols, such as HTTPS and SSL, to transmit data, which poses significant challenges for service usage classification by packet inspection. To this end, in this paper, we investigate how to exploit encrypted Internet traffic for classifying in-App usages. Specifically, we develop a system, named CUMMA, for classifying service usages of mobile messaging Apps by jointly modeling user behavioral patterns, network traffic characteristics, and temporal dependencies. Along this line, we first segment Internet traffic from traffic-flows into sessions with a number of dialogs in a hierarchical way. We also extract the discriminative features of traffic data from two perspectives: (i) packet length and (ii) time delay. Next, we train a service usage predictor to classify these segmented dialogs into single-type usages or outliers. In addition, we design a clustering Hidden Markov Model (HMM) based method to detect mixed dialogs among the outliers and decompose mixed dialogs into sub-dialogs of single-type usage. Indeed, CUMMA enables mobile analysts to identify service usages and analyze end-user in-App behaviors even for encrypted Internet traffic.
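The segmentation and feature-extraction steps named in this abstract could be sketched roughly as below. The abstract doesn't say how CUMMA splits traffic or which statistics it computes, so splitting on an idle-time gap, the function names `segment` and `features`, and the use of mean packet length and mean inter-packet delay are assumptions chosen only to make the idea concrete.

```python
from statistics import mean


def segment(packets, gap):
    """Split time-sorted (timestamp, length) pairs wherever the idle gap exceeds `gap` seconds."""
    groups = []
    for pkt in packets:
        if groups and pkt[0] - groups[-1][-1][0] <= gap:
            groups[-1].append(pkt)   # close enough to the previous packet: same group
        else:
            groups.append([pkt])     # long silence (or first packet): start a new group
    return groups


def features(group):
    """Summarize one group by the two perspectives named in the abstract."""
    delays = [b[0] - a[0] for a, b in zip(group, group[1:])]
    return {
        "mean_length": mean(p[1] for p in group),       # packet length perspective
        "mean_delay": mean(delays) if delays else 0.0,  # time delay perspective
    }
```

A hierarchical split then amounts to applying `segment` twice, first with a large gap to obtain sessions and then with a smaller gap inside each session to obtain dialogs; the resulting feature dictionaries would be the input to whatever classifier is trained next.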
Smart detection of offensive words in social media using the soundex algorith... - IJECEIAES
Offensive posts in social media that are inappropriate for a specific age, level of maturity, or impression are quite often aimed more at non-adult than adult participants. The growing number of masked offensive words in social media is now one of the ethically challenging problems, and there has been growing interest in developing methods that can automatically detect posts containing such words. This study aimed to develop a method that detects masked offensive words, in which a partial alteration of the word may trick conventional monitoring systems when the word is posted on social media. The proposed method proceeds in three phases: a pre-processing phase, which includes filtering, tokenization, and stemming; an offensive word extraction phase, which relies on the soundex algorithm and a permuterm index; and a post-processing phase, which classifies the users’ posts in order to highlight the offensive content. Accordingly, the method detects masked offensive words in written text, thus preventing certain types of offensive words from being published. The performance evaluation of the proposed method indicates a 99% accuracy in detecting offensive words.
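To show why a phonetic code helps with masked words, here is a minimal sketch of the standard American Soundex coding followed by a lookup against a word list. Only the Soundex step is modelled: the placeholder lexicon `OFFENSIVE`, the helper `flag_masked`, and whitespace tokenization are assumptions, and the paper's stemming, permuterm index, and post-processing classifier are not reproduced.

```python
# Consonant groups of standard American Soundex; vowels, y, h and w get no digit.
CODES = {c: d for d, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                                 "4": "l", "5": "mn", "6": "r"}.items() for c in letters}


def soundex(word):
    """Return the four-character Soundex code, or "" if the word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        digit = CODES.get(c, "")
        if digit and digit != prev:
            code += digit
        if c not in "hw":   # h and w do not separate repeated codes; vowels do
            prev = digit
    return (code + "000")[:4]


OFFENSIVE = {"stupid", "idiot"}   # placeholder lexicon, for illustration only
OFFENSIVE_CODES = {soundex(w) for w in OFFENSIVE}


def flag_masked(text):
    """Return the tokens whose Soundex code matches a word in the lexicon."""
    return [tok for tok in text.split() if soundex(tok) in OFFENSIVE_CODES]
```

Because respellings such as "stoopid" keep the same consonant skeleton as "stupid", both map to the same code and the altered spelling is flagged. Phonetic matching of this kind can also over-match harmless words that share a code, which is presumably one reason the method adds further processing phases.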
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approach - ijma
A large amount of data is present on the web. It contains a huge number of web pages, and finding suitable information in them is a very cumbersome task. There is a need to organize data in a formal manner so that users can easily access and use it. Many Information Retrieval (IR) techniques exist for retrieving information from documents, but current IR techniques are not advanced enough to exploit the semantic knowledge within documents and give precise results. IR technology is a major factor responsible for handling annotations in Semantic Web (SW) languages. With the growth of the web and the huge amount of information available on it, which may be unstructured, semi-structured, or structured, it has become increasingly difficult to identify the relevant pieces of information on the Internet. Knowledge representation languages are used for retrieving information, so there is a need to build an ontology using a well-defined methodology; this process is called Ontology Development. Secondly, cloud computing and data mining have become prominent in the current application of information technology. With changing trends and the emergence of new concepts in the information technology sector, data mining and knowledge discovery have proved to be of significant importance. Data mining can be defined as the process of extracting data or information from a database that is not explicitly defined by the database and that can be used to draw generalized conclusions based on the trends obtained from the data. A database may be described as a collection of formally structured data. Multi-agent data mining may be defined as the use of various agents that cooperatively interact with the environment to achieve a specified objective. Multiple agents always act on behalf of users and coordinate, cooperate, negotiate, and exchange data with each other. An agent basically refers to a software agent, a robot, or a human being. Knowledge discovery can be defined as the process of critically searching large collections of data with the aim of finding patterns that can be used to make generalized conclusions. These patterns are sometimes referred to as knowledge about the data. Cloud computing can be defined as the delivery of computing services in which shared resources, information, and software are provided over a network, for example, the information superhighway. Cloud computing is normally provided over a web-based service that hosts all the required resources. Knowledge mining is used in many fields of study, such as science and medicine, finance, education, manufacturing, and commerce. In this paper, the Semantic Web addresses the first part of this challenge by trying to make the data also machine understandable in the form of Ontology, while Multi-Agen
Classification-based Retrieval Methods to Enhance Information Discovery on th... - IJMIT JOURNAL
The widespread adoption of the World-Wide Web (the Web) has created challenges both for society as a whole and for the technology used to build and maintain the Web. The ongoing struggle of information retrieval systems is to wade through this vast pile of data and satisfy users by presenting them with the information that most adequately fits their needs. On a societal level, the Web is expanding faster than we can comprehend its implications or develop rules for its use. The ubiquitous use of the Web has raised important social concerns in the areas of privacy, censorship, and access to information. On a technical level, the novelty of the Web and the pace of its growth have created challenges not only in the development of new applications that realize the power of the Web, but also in the technology needed to scale applications to accommodate the resulting large data sets and heavy loads. This thesis presents searching algorithms and hierarchical classification techniques for increasing a search service's understanding of web queries. Existing search services rely solely on a query's occurrence in the document collection to locate relevant documents. They typically do not perform any task or topic-based analysis of queries using other available resources, and do not leverage changes in user query patterns over time. Provided within is a set of techniques and metrics for performing temporal analysis on query logs. Our log analyses are shown to be reasonable and informative, and can be used to detect changing trends and patterns in the query stream, thus providing valuable data to a search service.
Social networks have become very popular, with an overwhelmingly high rate of growth. Because of this popularity, online social networks face the problem of spamming, and the activities of spam and spammers have led to substantial economic loss. Spamming has also led to the uncontrolled dissemination of viruses and malware, promotional ads, phishing, and scams. Spam activity has entered a new and dangerous dimension: spammers have stepped up their tactics on online social networks, consuming large amounts of network bandwidth and causing lost revenue and significant economic loss to both the private and public sectors. In previous work on spammer classification taxonomy, various machine learning techniques have been used extensively to detect spam activities and spammers in online social networks. Various classifiers are learned over content-based features extracted from users' interactions and profiles to label them as spam/spammers or legitimate. Recently, new network structural benchmark features have been proposed for the spammer detection task, but their importance under structural benchmark learning methods has not yet been extensively evaluated. In this research work, we evaluate the metric performance of some structural benchmark learning methods using scientific and strategic approach based attributes extracted from an interaction network for the task of spammer detection in online social networks.
As the Internet grows, users of search providers continually demand search results that accurately match their wishes. Personalized search is one of the options available to users for tailoring the search results returned to them, based on personal data provided to the search provider. This raises privacy concerns, however, as users are typically anxious about revealing personal information to an often faceless service provider on the Internet. This work deals with the privacy issues surrounding personalized search and discusses ways in which privacy can be improved so that users become more comfortable with releasing their personal information in order to obtain more precise search results.
Context Driven Technique for Document Classification - IDES Editor
In this paper we present an innovative hybrid Text Classification (TC) system that bridges the gap between statistical and context based techniques. Our algorithm harnesses contextual information at two stages. First, it extracts a cohesive set of keywords for each category by using lexical references, implicit context as derived from LSA, and word-vicinity driven semantics. Second, each document is represented by a set of context rich features whose values are derived by considering both lexical cohesion and the extent of coverage of salient concepts via lexical chaining. After keywords are extracted, a subset of the input documents is apportioned as a training set. Its members are assigned categories based on their keyword representation. These labeled documents are used to train binary SVM classifiers, one for each category. The remaining documents are supplied to the trained classifiers in the form of their context-enhanced feature vectors. Each document is finally ascribed its appropriate category by an SVM classifier.
Authorization mechanism for multiparty data sharing in social network - eSAT Publishing House
Filter unwanted messages from walls and blocking nonlegitimate user in osnIJSRD
Today’s life is totally based on Internet. Now a days people cannot imagine life without Internet. Information and communication technology plays vital role in today’s online networked society. In today’s life, we are very close to the online social networks. Online social networks are used for posting and sharing information across various social networking sites. But user’s privacy is not maintained by online social networks. For maintaining users sensitive information’s privacy online social networks provides little or no support. For filtering unwanted messages we propose a system using machine learning (ML). Using machine learning in soft classifier content based filtering performed. In proposed system filtering rules (FR’s) are provided for content independent filtering.. Blacklists are used for more flexibility by which filtering choices are increased. Proposed system provides security to the Online Social Networks.
Content Based Message Filtering For OSNS Using Machine Learning ClassifierIJMER
Online social networking(OSNs) sites like Twitter, Orkut, YouTube, and Face book are among
the most popular sites on the Internet. Users of these web sites forms a social network, which provides a
powerful means of sharing, organizing, and finding useful information .Unlike web information , the
Online social networks (OSN) are organized around more number of users joins the network, shares their
information and create the links to communicate with other online users. The resulting social network
sites provides a basis for maintaining social relationships, for finding users with similar interests, and for
locating content and knowledge that has been contributed or endorsed by other users. In OSNs
information filtering can be used for avoiding the unwanted messages sharing or commenting on the user
Walls. In this paper, we have proposed a system to filter undesired messages from OSN walls. The system
exploits a machine learning soft classifier to enforce customizable content-dependent FRs. Moreover, the
flexibility of the proposed system in terms of filtering options is enhanced through the management of
BLs.
Filtered wall is a system to filter undesired messages from OSN walls.
This system approach decides when user should be inserted into a black list.
Filtered wall has a wide variety of applications in OSN wall
Rule based messege filtering and blacklist management for online social networkeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Semantic Massage Addressing based on Social Cloud Actor's InterestsCSCJournals
Wireless communication with Mobile Terminals has become popular tools for collecting and sending information and data. With mobile communication comes the Short Message Service (SMS) technology which is an ideal way to stay connected with anyone, anywhere anytime to help maintain business relationships with customers. Sending individual SMS messages to long list of mobile numbers can be very time consuming, and face problems of wireless communications such as variable and asymmetric bandwidth, geographical mobility and high usage costs and face the rigidity of lists. This paper proposes a technique that assures sending the message to semantically specified group of recipients. A recipient group is automatically identified based on personal information (interests, work place, publications, social relationships, etc.) and behavior based on a populated ontology created by integrating the publicly available FOAF (Friend-of-a-Friend) documents. We demonstrate that our simple technique can first, ensure extracting groups effectively according to the descriptive attributes and second send SMS effectively and can help combat unintentional spam and preserve the privacy of mobile numbers and even individual identities. The technique provides fast, effective, and dynamic solution to save time in constructing lists and sending group messages which can be applied both on personal level or in business.
Filter unwanted messages from walls and blocking nonlegitimate user in osnIJSRD
Today’s life is totally based on Internet. Now a days people cannot imagine life without Internet. Information and communication technology plays vital role in today’s online networked society. In today’s life, we are very close to the online social networks. Online social networks are used for posting and sharing information across various social networking sites. But user’s privacy is not maintained by online social networks. For maintaining users sensitive information’s privacy online social networks provides little or no support. For filtering unwanted messages we propose a system using machine learning (ML). Using machine learning in soft classifier content based filtering performed. In proposed system filtering rules (FR’s) are provided for content independent filtering.. Blacklists are used for more flexibility by which filtering choices are increased. Proposed system provides security to the Online Social Networks.
A web service is defined as a software system designed to support interoperable machine-to-machine interaction over a network. Put in another way, Web services provide a framework for system integration, independent of programming language and operating system. Web services are widely deployed in current distributed systems and have become the technology of choice. The suitability of Web services for integrating heterogeneous systems is largely facilitated through its extensive use of the Extensible Markup Language (XML). Thus, the security of a Web services based system depends not only on the security of the services themselves, but also on the confidentiality and integrity of the XML based SOAP messages used for communication. Recently, Web services have generated great interests in both vendors and researchers. A web service, based on existing Internet protocols and open standards, and provides a flexible solution to the problem of application integration. This paper provides an overview of the web services, web service security and the various algorithms used for encryption of the SOAP messages.
A Mobile Messaging Apps: Service Usage Classification to Internet Traffic and...dbpublications
The rapid adoption of mobile
messaging Apps has enabled us to collect massive
amount of encrypted Internet traffic of mobile
messaging. The classification of this traffic into
different types of in-App service usages can help for
intelligent network management, such as managing
network bandwidth budget and providing quality of
services. Traditional approaches for classification of
Internet traffic rely on packet inspection, such as
parsing HTTP headers. However, messaging Apps are
increasingly using secure protocols, such as HTTPS
and SSL, to transmit data. This imposes significant
challenges on the performances of service usage
classification by packet inspection. To this end, in this
paper, we investigate how to exploit encrypted Internet
traffic for classifying in-App usages. Specifically, we
develop a system, named CUMMA, for classifying
service usages of mobile messaging Apps by jointly
modeling user behavioral patterns, network traffic
characteristics, and temporal dependencies. Along this
line, we first segment Internet traffic from trafficflows
into sessions with a number of dialogs in a
hierarchical way. Also, we extract the discriminative
features of traffic data from two perspectives: (i)
packet length and (ii) time delay. Next, we teach a
service usage predictor to classify these segmented
dialogs into single-type usages or outliers. In addition,
we design a clustering Hidden Markov Model (HMM)
based method to detect mixed dialogs from outliers
and decompose mixed dialogs into sub-dialogs of
single-type usage. Indeed, CUMMA enables mobile
analysts to identify service usages and analyze end-user
in-App behaviors even for encrypted Internet
traffic.
Smart detection of offensive words in social media using the soundex algorith...IJECEIAES
Offensive posts on social media that are inappropriate for a specific age, level of maturity, or impression quite often reach underage participants more than adult ones. Nowadays, the growth in the number of masked offensive words on social media is one of the ethically challenging problems. Thus, there has been growing interest in the development of methods that can automatically detect posts with such words. This study aimed at developing a method that can detect masked offensive words, in which partial alteration of the word may trick conventional monitoring systems when it is posted on social media. The proposed method proceeds in a series of phases: a pre-processing phase, which includes filtering, tokenization, and stemming; an offensive word extraction phase, which relies on the soundex algorithm and a permuterm index; and a post-processing phase that classifies the users’ posts in order to highlight the offensive content. Accordingly, the method detects masked offensive words in written text, thus preventing certain types of offensive words from being published. Evaluation of the proposed method indicates a 99% accuracy in detecting offensive words.
Mining in Ontology with Multi Agent System in Semantic Web : A Novel Approachijma
A large amount of data is present on the web. It contains huge number of web pages and to find suitable
information from them is very cumbersome task. There is need to organize data in formal manner so that
user can easily access and use them. To retrieve information from documents, there are many Information
Retrieval (IR) techniques. Current IR techniques are not so advanced that they can be able to exploit
semantic knowledge within documents and give precise results. IR technology is major factor responsible
for handling annotations in Semantic Web (SW) languages. With the rate of growth of web and huge
amount of information available on the web which may be in unstructured, semi structured or structured
form, it has become increasingly difficult to identify the relevant pieces of information on the internet. IR
technology is major factor responsible for handling annotations in Semantic Web (SW) languages.
Knowledgeable representation languages are used for retrieving information. So, there is need to build an
ontology that uses well defined methodology and process of developing ontology is called Ontology
Development. Secondly, Cloud computing and data mining have become famous phenomena in the current
application of information technology. With the changing trends and emerging of the new concept in the
information technology sector, data mining and knowledge discovery have proved to be of significant
importance. Data mining can be defined as the process of extracting data or information from a database
which is not explicitly defined by the database and can be used to come up with generalized conclusions
based on the trends obtained from the data. A database may be described as a collection of formerly
structured data. Multi agents data mining may be defined as the use of various agents cooperatively
interact with the environment to achieve a specified objective. Multi agents will always act on behalf of
users and will coordinate, cooperate, negotiate and exchange data with each other. An agent would
basically refer to a software agent, a robot or a human being. Knowledge discovery can be defined as the
process of critically searching large collections of data with the aim of coming up with patterns that can be
used to make generalized conclusions. These patterns are sometimes referred to as knowledge about the
data. Cloud computing can be defined as the delivery of computing services in which shared resources,
information and software are provided over a network, for example, the information superhighway.
Cloud computing is normally provided over a web based service which hosts all the resources required. As,
the knowledge mining is used in many fields of study such as in science and medicine, finance, education,
manufacturing and commerce. In this paper, the Semantic Web addresses the first part of this challenge by
trying to make the data also machine understandable in the form of Ontology, while Multi-Agen
Classification-based Retrieval Methods to Enhance Information Discovery on th...IJMIT JOURNAL
The widespread adoption of the World-Wide Web (the Web) has created challenges both for society as a whole and for the technology used to build and maintain the Web. The ongoing struggle of information retrieval systems is to wade through this vast pile of data and satisfy users by presenting them with information that most adequately fits their needs. On a societal level, the Web is expanding faster than we can comprehend its implications or develop rules for its use. The ubiquitous use of the Web has raised important social concerns in the areas of privacy, censorship, and access to information. On a technical level, the novelty of the Web and the pace of its growth have created challenges not only in the development of new applications that realize the power of the Web, but also in the technology needed to scale applications to accommodate the resulting large data sets and heavy loads. This thesis presents searching algorithms and hierarchical classification techniques for increasing a search service's understanding of web queries. Existing search services rely solely on a query's occurrence in the document collection to locate relevant documents. They typically do not perform any task or topic-based analysis of queries using other available resources, and do not leverage changes in user query patterns over time. Provided within are a set of techniques and metrics for performing temporal analysis on query logs. Our log analyses are shown to be reasonable and informative, and can be used to detect changing trends and patterns in the query stream, thus providing valuable data to a search service.
Social networks have become very popular and are growing at an overwhelming rate. Because of this popularity, online social networks face the problem of spamming, which has led to substantial economic loss through spam and spammers' activities. It has also led to the uncontrolled dissemination of viruses and malware, promotional ads, phishing, and scams. Spam activity has entered a new and dangerous dimension: spammers have stepped up their tactics on online social networks, and spam consumes large amounts of network bandwidth, leading to less revenue and significant economic loss for both the private and public sectors. In previous work on spammer classification taxonomy, various machine learning techniques have been used extensively to detect spam activities and spammers in online social networks. Various classifiers are learned over content-based features extracted from users' interactions and profiles to label them as spam/spammers or legitimate. Recently, new network structural benchmark features have been proposed for the spammer detection task, but their importance under structural benchmark learning methods has not yet been extensively evaluated. In this research work, we evaluate the performance of some structural benchmark learning methods using attributes, based on a scientific and strategic approach, extracted from an interaction network for the task of spammer detection in online social networks.
As the Internet grows, so does the number of users of search providers, who continually demand search
results that match their wishes. Personalized search is one of the options available to users in
order to shape the search results returned to them based on personal data they provide to the search
provider. This raises privacy concerns, however, as users are typically reluctant to reveal
personal information to an often faceless service provider on the Internet. This work proposes to deal
with the privacy issues surrounding personalized search and discusses ways that privacy can be improved
so that users become more comfortable with releasing their personal information in order to obtain more precise
search results.
Context Driven Technique for Document ClassificationIDES Editor
In this paper we present an innovative hybrid Text
Classification (TC) system that bridges the gap between
statistical and context based techniques. Our algorithm
harnesses contextual information at two stages. First it extracts
a cohesive set of keywords for each category by using lexical
references, implicit context as derived from LSA and wordvicinity
driven semantics. And secondly, each document is
represented by a set of context rich features whose values are
derived by considering both lexical cohesion as well as the extent
of coverage of salient concepts via lexical chaining. After
keywords are extracted, a subset of the input documents is
apportioned as training set. Its members are assigned categories
based on their keyword representation. These labeled
documents are used to train binary SVM classifiers, one for
each category. The remaining documents are supplied to the
trained classifiers in the form of their context-enhanced feature
vectors. Each document is finally ascribed its appropriate
category by an SVM classifier.
Authorization mechanism for multiparty data sharing in social networkeSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Filter unwanted messages from walls and blocking nonlegitimate user in osnIJSRD
Today's life is largely based on the Internet. Nowadays people cannot imagine life without the Internet. Information and communication technology plays a vital role in today's online networked society, and we are very close to online social networks. Online social networks are used for posting and sharing information across various social networking sites, but they do not maintain users' privacy: they provide little or no support for protecting users' sensitive information. For filtering unwanted messages we propose a system using machine learning (ML). Content-based filtering is performed using a machine learning soft classifier. In the proposed system, filtering rules (FRs) are provided for content-independent filtering. Blacklists are used for more flexibility, increasing the filtering choices. The proposed system provides security to online social networks.
Building a recommendation system based on the job offers extracted from the w...IJECEIAES
Recruitment, or job search, is increasingly carried out throughout the world by a large population of users through various channels, such as websites, platforms, and professional networks. Given the large volume of information related to job descriptions and user profiles, it is complicated to appropriately match a user's profile with a job description, and vice versa. The job search approach has drawbacks, since the job seeker needs to search job offers on each recruitment platform, manage their accounts, and apply for the relevant job vacancies, which wastes considerable time and effort. The contribution of this research work is the construction of a recommendation system based on job offers extracted from the web and on the e-portfolios of job seekers. After the extraction of the data, natural language processing is applied so that the data is structured and ready for filtering and analysis. The proposed system is content-based: it measures the degree of correspondence between the attributes of the e-portfolio and those of each job offer in the same list of competence specialties using the Euclidean distance. The results are sorted in decreasing order to display the job offers from most to least relevant.
Scraping and Clustering Techniques for the Characterization of Linkedin Profilescsandit
The socialization of the web has undertaken a new dimension after the emergence of the Online
Social Networks (OSN) concept. The fact that each Internet user becomes a potential content
creator entails managing a big amount of data. This paper explores the most popular
professional OSN: LinkedIn. A scraping technique was implemented to get around 5 Million
public profiles. The application of natural language processing techniques (NLP) to classify the
educational background and to cluster the professional background of the collected profiles led
us to provide some insights about this OSN’s users and to evaluate the relationships between
educational degrees and professional careers.
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)paperpublications3
Abstract: The main aim of this project is to secure user login and data sharing on social networks such as Gmail and Facebook, and to find anonymous users on these networks. If the original user is not present on the network but friends or an anonymous user know the login details, they can misuse the user's chats. This project aims to stop anonymous users from using the network without the original user's knowledge. An unauthorized user may use the login to chat or to share images, videos, etc.; this is the problem addressed in this project. The user first registers their details with one secured question and answer. Because the anonymous user can delete their chat or data, the secured question is used to recover the unauthorized user's chat history or sharing details along with their IP address or MAC address. In this way the project provides a means of preventing anonymous users from misusing the original user's login details.
A Proposal on Social Tagging Systems Using Tensor Reduction and Controlling R...ijcsa
A Social Tagging System (STS) is a process in which users express their interest by tagging a particular item. STSs are associated with Web 2.0 and are a rich source of information for users through their recommendations. Different types of recommendations are modeled by a 3-order tensor, on which multiway latent semantic analysis and dimensionality reduction are performed using both the Higher Order Singular Value Decomposition (HOSVD) method and the KernelSVD smoothing technique. We now propose a 4-order tensor approach, which we name Tensor Reduction. Here the tagged items can be viewed by users who were recommended the same item and tagged it. This can improve the efficiency of social tagging recommendations and also control unwanted requests. The results show significant improvements in terms of effectiveness.
Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Se...ijtsrd
Online Social Networks (OSNs) provide a diversity of applications for human users to network with family, friends and even strangers. One such application, the friend search engine, allows the general public to query individual users' friend lists and has been gaining popularity recently. Without proper design, this application may mistakenly disclose users' private relationship information. Existing work offers a privacy preservation solution that can effectively boost OSNs' sociability while protecting users' friendship privacy against attacks launched by individual malicious requestors. This project proposes an advanced collusion attack, in which a victim user's friendship privacy can be compromised through a series of carefully designed queries launched in coordination by multiple malicious requestors. The effect of the proposed collusion attack is validated on synthetic and real-world social network data sets. The study of advanced collusion attacks will help in designing a more robust and secure friend search engine on OSNs in the near future. R. Brintha | H. Parveen Bagum "Retrieving Hidden Friends a Collusion Privacy Attack against Online Friend Search Engine" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-4 | Issue-4 , June 2020, URL: https://www.ijtsrd.com/papers/ijtsrd31687.pdf Paper Url :https://www.ijtsrd.com/computer-science/world-wide-web/31687/retrieving-hidden-friends-a-collusion-privacy-attack-against-online-friend-search-engine/r-brintha
A novel method for generating an elearning ontologyIJDKP
The Semantic Web provides a common framework that allows data to be shared and reused across
applications, enterprises, and community boundaries. The existing web applications need to express
semantics that can be extracted from users' navigation and content, in order to fulfill users' needs. E-learning
has specific requirements that can be satisfied through the extraction of semantics from learning
management systems (LMS) that use relational databases (RDB) as backend. In this paper, we propose
transformation rules for building owl ontology from the RDB of the open source LMS Moodle. It allows
transforming all possible cases in RDBs into ontological constructs. The proposed rules are enriched by
analyzing stored data to detect disjointness and totalness constraints in hierarchies, and calculating the
participation level of tables in n-ary relations. In addition, our technique is generic; hence it can be applied
to any RDB.
Electronic mail is the most widely used and suitable method of transferring messages electronically from one
person to another, originating from and going to any part of the world. The main features of electronic mail, namely its speed,
dependability, well-equipped storage options and a large number of added services, make it highly popular
among people from all sectors of business and society. But its popularity also has a negative side: electronic
mail is a preferred medium for a large number of attacks over the Internet. Among the most popular of these attacks
is spam. Some methods are essential in detecting spam-related mails, but they have higher false
positives. A number of filters, such as checksum-based filters, Bayesian filters, machine learning based filters and
memory-based filters, are usually used to recognize spam. As spammers constantly try to find ways to
evade existing filters, new filters need to be developed to catch spam. This paper proposes an
efficient spam mail filtering method using a user profile based ontology. Ontologies permit machine-understandable
semantics of data. It is important to interchange information with each other for more efficient spam
filtering. Thus, it is essential to build an ontology and a framework for capable email filtering. Using an ontology that is
specifically designed to filter spam, a bunch of useless bulk email could be filtered out on the system. We propose a
user profile based spam filter that classifies email based on the likelihood that the user profile within it has been
included in spam or valid email.
Decision Support for E-Governance: A Text Mining ApproachIJMIT JOURNAL
Information and communication technology has the capability to improve the process by which governments involve citizens in formulating public policy and public projects. Even though much of government regulations may now be in digital form (and often available online), due to their complexity and diversity, identifying the ones relevant to a particular context is a non-trivial task. Similarly, with the advent of a number of electronic online forums, social networking sites and blogs, the opportunity of gathering citizens’ petitions and stakeholders’ views on government policy and proposals has increased greatly, but the volume and the complexity of analyzing unstructured data makes this difficult. On the other hand, text mining has come a long way from simple keyword search, and matured into a discipline capable of dealing with much more complex tasks. In this paper we discuss how text-mining techniques can help in retrieval of information and relationships from textual data sources, thereby assisting policy makers in discovering associations between policies and citizens’ opinions expressed in electronic public forums and blogs etc. We also present here, an integrated text mining based architecture for e-governance decision support along with a discussion on the Indian scenario.
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Courier management system project report.pdfKamal Acharya
It is now-a-days very important for the people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems which mostly use the manual way of receiving and delivering the articles. There is no way to track the articles till they are received and there is no way to let the customer know what happened in transit, once he booked some articles. In such a situation, we need a system which completely computerizes the cargo activities including time to time tracking of the articles sent. This need is fulfilled by Courier Management System software which is online software for the cargo management people that enables them to receive the goods from a source and send them to a required destination and track their status from time to time.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of the general workflow and administration process of the shop. The main processes of the system focus on the customer's request, where the system is able to search for the most appropriate products and deliver them to the customers. It should help the employees to quickly identify the list of cosmetic products that have reached the minimum quantity and also keep track of the expiry date of each cosmetic product. It should help the employees to find the rack number in which a product is placed. It is also a faster and more efficient way of working.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Top 10 Oil and Gas Projects in Saudi Arabia 2024.pdf
Efficient Filteration Of Unwanted Messages
in Social Networking Sites
Gunduboina Penchalaiah 1, Mr. Md. Amanatulla 2
1 Computer Science and Engineering
2 Computer Science and Engineering
1 penchalaiah.gunduboina@gmail.com, 2 amanatulla@gmail.com
Abstract: Social networking sites that
facilitate communication of information
between users allow users to post messages
as an important function. Unnecessary posts
could spam a user's wall, which is the page
where posts are displayed, thus preventing
the user from viewing relevant
messages. The Trust Evaluation System (TES)
ensures the “reputation” of reporters by
tracking how often the larger recipient
community agrees with their assessment of
a message. In addition, TES uses an
automated system of highly proficient
fingerprinting algorithms. Advanced Message
Fingerprinting maintains the privacy of the
content and reduces the amount of data to be
analyzed. Once a message fingerprint is
cataloged as spam, all future messages
matching that fingerprint are automatically
filtered. Because a reputation-based
collaborative system does not draw blanket
conclusions about terms, hosts, or people, it
has proven to increase accuracy, particularly
as it relates to false positives and critical
false positives, while simultaneously
decreasing administration costs.
Keywords: Online social networking,
Contention modeling, Trust Evaluation
System, False positives.
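The fingerprint-and-reputation idea summarized in the abstract can be illustrated with a minimal sketch. This is not the TES implementation: the SHA-256 fingerprint over normalized text, the class name `TrustCatalog`, the default trust of 0.1, and the confidence threshold are all assumptions chosen only to make the idea concrete.

```python
import hashlib


def fingerprint(message: str) -> str:
    """Hash the normalized text so the catalog never stores message content.

    Assumption: lowercasing and collapsing whitespace stands in for the
    paper's (unspecified) fingerprinting algorithms.
    """
    normalized = " ".join(message.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class TrustCatalog:
    """Hypothetical catalog: trust-weighted spam reports per fingerprint."""

    DEFAULT_TRUST = 0.1  # assumed starting trust for unknown reporters

    def __init__(self, spam_threshold: float = 1.0):
        self.spam_threshold = spam_threshold
        self.agreed = {}      # reporter -> reports the community agreed with
        self.total = {}       # reporter -> reports judged so far
        self.confidence = {}  # fingerprint -> accumulated trust of reporters
        self.spam_fingerprints = set()

    def trust(self, reporter: str) -> float:
        """Fraction of past reports the community agreed with."""
        total = self.total.get(reporter, 0)
        if total == 0:
            return self.DEFAULT_TRUST
        return self.agreed.get(reporter, 0) / total

    def record_outcome(self, reporter: str, community_agreed: bool) -> None:
        """Update a reporter's history once the community has weighed in."""
        self.total[reporter] = self.total.get(reporter, 0) + 1
        if community_agreed:
            self.agreed[reporter] = self.agreed.get(reporter, 0) + 1

    def report_spam(self, reporter: str, message: str) -> None:
        """Add the reporter's trust to the message's confidence level."""
        fp = fingerprint(message)
        self.confidence[fp] = self.confidence.get(fp, 0.0) + self.trust(reporter)
        if self.confidence[fp] >= self.spam_threshold:
            self.spam_fingerprints.add(fp)

    def is_filtered(self, message: str) -> bool:
        """True if the message matches a fingerprint cataloged as spam."""
        return fingerprint(message) in self.spam_fingerprints
```

In this sketch a report from an unknown reporter barely moves the confidence level, while a report from a reporter with a good agreement history can be enough to catalog the fingerprint, after which every matching message is filtered without storing its text.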
1. INTRODUCTION:
Today, there is a continued rise of social
networking on the Web. Social networking
accounts for 1 of every 6 minutes spent
online and as MySpace declines, LinkedIn,
Twitter and Tumblr have grown at
impressive rates [1]. Social media are
becoming increasingly important to
recruiters and jobseekers alike. In Online
Social Networks (OSNs), there is the
possibility of posting or commenting
unwanted messages on particular
public/private areas, called in general
“walls”. Unnecessary posts could spam a
user’s wall, thus preventing the user from
viewing relevant messages. Information
filtering can therefore be used to give users
the ability to automatically control the
messages written on their own walls, by
filtering out unwanted messages.
OSNs today do not provide much support to
prevent unwanted messages on user walls.
For example, Facebook allows users to state
who is allowed to insert messages in their
walls (i.e., friends, friends of friends, or
defined groups of friends). Existing filters,
available as browser extensions and add-ons, include
the Spoiler Shield app, which filters feeds from both
Facebook and Twitter on iPhone, and Open
Proceedings of International Conference on Advances in Engineering and Technology
www.iaetsd.in
ISBN : 978 - 1505606395
International Association of Engineering and Technology for Skill Development
Tweet Filter which is a filter for twitter on
Chrome. These filters only take keywords
and filter out messages that contain the
specific words. Applications that are trying
to solve this issue through machine learning
techniques are either in the beta phase or
do not perform efficiently due to poor
learning curves used to analyze the
messages.
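The limitation of keyword-only filtering can be illustrated with a minimal sketch. The function name `keyword_filter` and the sample posts are assumptions made for illustration; this is not the implementation of Spoiler Shield or Open Tweet Filter.

```python
import re

def keyword_filter(messages, blocked_words):
    """Return the messages that contain none of the blocked words."""
    blocked = {w.lower() for w in blocked_words}
    kept = []
    for msg in messages:
        # Tokenise on letters, digits and apostrophes; compare case-insensitively.
        tokens = set(re.findall(r"[a-z0-9']+", msg.lower()))
        if tokens.isdisjoint(blocked):
            kept.append(msg)
    return kept

# Hypothetical wall posts.
posts = ["Big finale spoiler inside", "Lunch at noon?", "No sp0iler here"]
visible = keyword_filter(posts, ["spoiler"])
```

In this sketch the obfuscated post "No sp0iler here" passes the filter, because an exact keyword match knows nothing about the meaning of the message.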
Most of the work related to text filtering by
Machine Learning has been applied for
long-form text. Wall messages are
constituted by short text for which
traditional classification methods have
serious limitations since short texts do not
provide sufficient word occurrences. Thus, a
suitable text representation method is
proposed in this paper along with a neural-
network based classification algorithm [3]
to classify each message as neutral or non-
neutral, based on its content.
Besides classification facilities, Filtering
Rules (FRs) can support a variety of
different filtering criteria that can be
combined and customized according to the
user needs. More precisely, FRs exploit user
profiles, user relationships as well as the
output of the classification process to state
the filtering criteria to be enforced. In
addition, the system provides the support
for user-defined BlackLists (BLs), that is, lists
of users that are temporarily prevented from
posting any kind of message on a user wall.
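How a filtering rule and a blacklist could interact can be sketched as follows. The class names, field names, relationship labels and score scale are all hypothetical; the text does not specify a concrete rule language, so this is only one possible reading.

```python
from dataclasses import dataclass, field

@dataclass
class FilteringRule:
    relationship: str   # relationship the rule applies to (assumed label)
    blocked_class: str  # classifier label to block, e.g. "non-neutral"
    min_score: float    # block when the classifier score reaches this value

@dataclass
class Wall:
    rules: list = field(default_factory=list)
    blacklist: dict = field(default_factory=dict)  # user -> ban expiry time

    def accepts(self, author, relationship, label, score, now):
        # A blacklisted author is rejected until the temporary ban expires.
        if self.blacklist.get(author, 0) > now:
            return False
        # Otherwise the message is rejected if any rule matches it.
        for rule in self.rules:
            if (rule.relationship == relationship
                    and rule.blocked_class == label
                    and score >= rule.min_score):
                return False
        return True

wall = Wall(
    rules=[FilteringRule("friend_of_friend", "non-neutral", 0.8)],
    blacklist={"mallory": 100.0},
)
```

Here the blacklist is checked before any rule, so a banned user is blocked regardless of message content, and the ban lapses on its own once `now` passes the stored expiry time.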
This system is intended to be a software
application, that is, an add-on, for any social
networking service that allows users to post
messages. The social networking application
is itself developed first with minimalistic
features, the most important being posting
short-text messages. This is to emulate the
behaviour of existing OSNs like Facebook
and Twitter. Also, a database of users and
relationships between the users is
maintained. Thus, PostFilter acts as
intelligent software on top of existing OSNs,
providing users with a view of clean (non-
vulgar and non-offensive) or relevant posts.
2. LITERATURE REVIEW
The main contribution of this paper is the
design of a system providing customizable
content-based message filtering for OSNs,
based on ML techniques. As we have
pointed out in the introduction, to the best
of our knowledge we are the first to propose
this kind of application for OSNs. However,
our work has relationships both with the
state of the art in content-based filtering, as
well as with the field of policy-based
personalization for OSNs and, more in
general, web contents. Therefore, in what
follows, we survey the literature in both
these fields.
2.1 Content-based filtering:
Information filtering systems are designed
to classify a stream of dynamically
generated information dispatched
asynchronously by an information producer
and to present to the user the information
that is likely to satisfy his/her
requirements [3]. In content-based filtering
each user is assumed to operate
independently. As a result, a content-based
filtering system selects information items
based on the correlation between the
content of the items and the user
preferences as opposed to a collaborative
filtering system that chooses items based
on the correlation between people with
similar preferences. Documents processed
in content-based filtering are mostly textual
in nature and this makes content-based
filtering close to text classification. The
activity of filtering can be modeled, in fact,
as a case of single label, binary
classification, partitioning incoming
documents into relevant and non relevant
categories [4]. More complex filtering
systems include multi-label text
categorization automatically labeling
messages into partial thematic categories.
Content-based filtering is mainly based on
the use of the ML paradigm according to
which a classifier is automatically induced
by learning from a set of pre-classified
examples. A remarkable variety of related
work has recently appeared, differing
in the adopted feature extraction
methods, model learning, and collection of
samples [5], [6], [7], [8], [9]. The feature
extraction procedure maps text into a
compact representation of its content and
is uniformly applied to training and
generalization phases. The application of
content-based filtering on messages posted
on OSN user walls poses additional
challenges given the short length of these
messages as well as the wide range of
topics that can be discussed. Short text
classification has so far received little
attention in the scientific community.
Recent work highlights difficulties in
defining robust features, essentially due to
the fact that the description of the short
text is concise, with many misspellings,
non-standard terms and noise. Focusing on the
OSN domain, interest in access control and
privacy protection is quite recent. As far as
privacy is concerned, current work is mainly
focusing on privacy-preserving data mining
techniques, that is, protecting information
related to the network, i.e.,
relationships/nodes, while performing
social network analysis [5]. Works more
related to our proposals are those in the
field of access control. In this field, many
different access control models and related
mechanisms have been proposed so far
(e.g., [6,2,10]), which mainly differ on the
expressivity of the access control policy
language and on the way access control is
enforced (e.g., centralized vs.
decentralized). Most of these models
express access control requirements in
terms of relationships that the requestor
should have with the resource owner. We
use a similar idea to identify the users to
which a filtering rule applies. However, the
overall goal of our proposal is completely
different, since we mainly deal with filtering
of unwanted contents rather than with
access control. As such, one of the key
ingredients of our system is the availability
of a description for the message contents to
be exploited by the filtering mechanism as
well as by the language to express filtering
rules. In contrast, none of the access
control models previously cited exploits the
content of the resources to enforce access
control. We believe that this is a
fundamental difference. Moreover, the
notion of blacklists and their management
are not considered by any of these access
control models.
2.2 Policy-based personalization of OSN
contents
Recently, there have been some proposals
exploiting classification mechanisms for
personalizing access in OSNs. For instance,
in [11] a classification method has been
proposed to categorize short text messages
in order to avoid overwhelming users of
microblogging services with raw data. The
system described in [11] focuses on Twitter
and associates a set of categories with each
tweet describing its content. The user can
then view only certain types of tweets based
on his/her interests. In contrast, Golbeck and
Kuter [12] propose an application, called
FilmTrust, that exploits OSN trust
relationships and provenance information to
personalize access to the website. However,
such systems do not provide a filtering
policy layer by which the user can exploit
the result of the classification process to
decide how and to what extent to filter out
unwanted information [15]. In contrast, our
filtering policy language allows the setting
of FRs according to a variety of criteria that
consider not only the results of the
classification process but also the
relationships of the wall owner with other
OSN users as well as information on the
user profile. Moreover, our system is
complemented by a flexible mechanism for
BL management that provides a further
opportunity of customization to the filtering
procedure. The only social networking
service we are aware of providing filtering
abilities to its users is MyWOT, a social
networking service which gives its
subscribers the ability to: 1) rate resources
with respect to four criteria: trustworthiness,
vendor reliability, privacy, and child safety;
2) specify preferences determining whether
the browser should block access to a given
resource, or should simply return a warning
message on the basis of the specified rating.
Despite the existence of some similarities,
the approach adopted by MyWOT is quite
different from ours. In particular, it supports
filtering criteria which are far less flexible
than the ones of Filtered Wall since they are
only based on the four above-mentioned
criteria. Moreover, no automatic
classification mechanism is provided to the
end user. Our work is also inspired by the
many access control models and related
policy languages and enforcement
mechanisms that have been proposed so far
for OSNs, since filtering shares several
similarities with access control. Actually,
content filtering can be considered as an
extension of access control, since it can be
used both to protect objects from
unauthorized subjects, and subjects from
inappropriate objects. In the field of OSNs,
the majority of access control models
proposed so far enforce topology-based
access control, according to which access
control requirements are expressed in terms
of relationships that the requester should
have with the resource owner. We use a
similar idea to identify the users to which a
FR applies. However, our filtering policy
language extends the languages proposed for
access control policy specification in OSNs
to cope with the extended requirements of
the filtering domain. Indeed, since we are
dealing with filtering of unwanted contents
rather than with access control, one of the
key ingredients of our system is the
availability of a description for the message
contents to be exploited by the filtering
mechanism.
3. TRUST EVALUATION SYSTEM (TES):
TES is the reputation metric, or trust system,
that evaluates every new piece of feedback
submitted to the Nomination servers. The
primary function of TES is to assign a
“confidence” to fingerprints, a value
between Cmn (legitimate) and Cmx (spam),
based on the “reputation” or “trust level” of
the individual reporting the fingerprint. The
trust level, t, is a finite numeric value
attached to every community reporter. The
value t is, in turn, computed from the
corroborated historical confidence of the
fingerprints nominated by the reporter. The
circular assignment effectively turns the
classifier into a stable closed-loop control
system.
Figure 1: Process Flow of the Trust
Evaluation System
TES determines both the confidence the
community has in the disposition of a
fingerprint and the trust that the system
places in the decisions made by members of
the community. In a continuous process,
members of the community receive new
spam (1) and report their feelings (“spam”
or “not spam”) about the message to the
Nomination server, which in turn reports it
to TES (2). Based upon the trust associated
with each individual reporter, TES assigns
confidence to the fingerprint (3) and reports
it to the service for distribution to the
community (4). TES then reevaluates the
community trust values to determine who
should gain and lose trust as a result of their
individual assessments of the message.
Just as in the real world, trust is earned
slowly and is difficult to attain. New
recipients start with a trust level of zero. In
the very beginning (at the launch of the
classifier), there were only a few hand-
picked recipients with a high trust level. As
zero-valued, untrusted community members
provide feedback, TES rewards reporters
whose feedback agrees with that of highly-
trusted, highly-reputable members of the
community. In other words, TES assigns
trust points to recipients when their reports
are corroborated by other highly trusted
recipients [18]. In practice, for every
fingerprint that achieves a high confidence
(meaning that the fingerprint was reported
and corroborated by highly trusted
recipients), TES gives one of the first reporters of
the fingerprint a small trust reward.
Untrusted recipients who report often and
report correctly eventually accrue enough
trust rewards to become trusted recipients
themselves. Once trusted, they implicitly
begin to participate in the process of
selecting newer trusted recipients. In this
manner, TES selectively inducts a
community of “highly-reputable,” “highly-
trusted” members—reporters who routinely
make decisions that are honored by the rest
of the community. TES also penalizes
recipients who disagree with the trusted
majority. Penalties are harsher than rewards,
so while gaining trust is hard, losing it is
rather easy.
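The reward and penalty behaviour described above can be sketched as follows. The constants `REWARD` and `PENALTY`, the floor at zero, and the choice to reward only the single earliest agreeing reporter are assumptions made for illustration; the text states only that rewards are small, that they go to one of the first reporters, and that penalties are harsher than rewards.

```python
REWARD = 1.0   # assumed size of a trust reward
PENALTY = 5.0  # assumed penalty, larger than REWARD so trust is lost faster

def update_trust(trust, reports, consensus):
    """Adjust reporter trust after trusted reporters settle a fingerprint.

    trust:     dict mapping reporter -> trust level (new reporters start at 0)
    reports:   list of (reporter, disposition) in arrival order,
               with disposition +1 (spam) or -1 (not spam)
    consensus: +1 or -1, the disposition backed by the trusted majority
    """
    rewarded_early = False
    for reporter, disposition in reports:
        level = trust.get(reporter, 0.0)
        if disposition == consensus:
            # Only the earliest agreeing reporter is rewarded in this sketch.
            if not rewarded_early:
                level += REWARD
                rewarded_early = True
        else:
            # Disagreeing with the trusted majority costs trust (floored at 0).
            level = max(0.0, level - PENALTY)
        trust[reporter] = level
    return trust

trust = update_trust(
    {"carol": 10.0},
    [("newbie", +1), ("dave", +1), ("carol", -1)],
    consensus=+1,
)
```

With these assumed constants, a reporter needs five corroborated early reports to recover from a single disagreement, which mirrors the asymmetry between gaining and losing trust.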
The second aspect of TES’s responsibility is
to assign confidence to fingerprints.
Fingerprint confidence is a function of the
reporter’s trust level and the disposition
(block/unblock) of their reports. TES
updates confidence in real-time with every
report. Once the confidence reaches a
threshold, known as the average spam
confidence, the fingerprint is promoted to the Catalog
servers. If a promoted fingerprint is
unblocked by trusted recipients, its
confidence can drop below the average spam
confidence, which results in its immediate
removal from the catalog servers. The real-
time nature of confidence assignments
results in an extremely responsive system
that can self correct within seconds.
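The promotion and removal of fingerprints around the threshold can be sketched as below. The class name `Catalog`, the numeric threshold and the confidence scale are hypothetical; the sketch only shows that each report re-evaluates catalog membership immediately.

```python
class Catalog:
    """Toy stand-in for the Catalog servers (illustrative only)."""

    def __init__(self, threshold):
        self.threshold = threshold  # plays the role of the average spam confidence
        self.confidence = {}        # fingerprint -> latest confidence
        self.cataloged = set()      # fingerprints currently filtered as spam

    def report(self, fingerprint, new_confidence):
        # Every report updates confidence and re-checks the threshold at once.
        self.confidence[fingerprint] = new_confidence
        if new_confidence >= self.threshold:
            self.cataloged.add(fingerprint)
        else:
            self.cataloged.discard(fingerprint)

    def is_spam(self, fingerprint):
        return fingerprint in self.cataloged

catalog = Catalog(threshold=0.7)
catalog.report("fp1", 0.9)           # promoted
promoted = catalog.is_spam("fp1")
catalog.report("fp1", 0.4)           # unblocked by trusted recipients
after_unblock = catalog.is_spam("fp1")
```

Because membership is recomputed on every report rather than in batches, a wrongly promoted fingerprint leaves the catalog as soon as its confidence drops below the threshold.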
In more formal terms, a given set of
fingerprint reporters, R, who each have a
trust level t_r, send in reports that have a
disposition d_r, where d_r = -1 if the
fingerprint misclassifies a legitimate
message as spam and d_r = 1 if the message is
spam. After a number of reports are
collected, it is possible to compute a
fingerprint confidence using the following
equation:
However, it is important to note
that TES uses a variation of the above
algorithm to reduce its attack vulnerability.
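The confidence equation itself is not reproduced in this text. As a purely illustrative assumption, one natural way to combine the quantities defined above is a trust-weighted average of the dispositions, sum(t_r * d_r) / sum(t_r) over the reporters in R. The sketch below implements that assumed form; it should not be read as the formula TES actually uses.

```python
def fingerprint_confidence(reports):
    """Trust-weighted average of dispositions, in the range [-1, 1].

    reports: iterable of (t_r, d_r) pairs, with trust t_r >= 0 and
             disposition d_r equal to +1 (spam) or -1 (not spam).
    Returns 0.0 when no reporter carries any trust (an assumed convention).
    """
    reports = list(reports)
    total_trust = sum(t for t, _ in reports)
    if total_trust == 0:
        return 0.0
    return sum(t * d for t, d in reports) / total_trust

# Two trusted "spam" reports, one trusted "not spam", one untrusted "not spam".
c = fingerprint_confidence([(10.0, 1), (5.0, 1), (5.0, -1), (0.0, -1)])
```

Under this assumed form, a reporter with zero trust has no influence on the result, which is consistent with the requirement that a reporter must first be trusted in order to affect a fingerprint's disposition.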
3.1. EMERGENT PROPERTIES OF TES
TES has several desirable, even surprising,
emergent properties when deployed on a
large scale. These properties are critical to
the effectiveness of the system and typical
of well-designed reputation metrics. We
discuss some of these properties in this
section and contrast them with related
properties of other anti-spam approaches.
3.1.1. Responsiveness
TES’s reward selection metric prefers those
recipients who report correctly and early.
This means that over time TES can identify
all such reporters whose initial reports have
a high likelihood of being accepted as spam
by the rest of the community. As the group
of trusted recipients becomes larger, the first
few reports are extremely reliable predictors
of a fingerprint’s final disposition. As a
result, TES can respond extremely quickly
to new spam attacks.
Anti-spam methods that either require expert
supervision, or that are inherently unable to
train on individual samples, have
significantly longer response latencies.
These systems are unable to stop short-lived
attacks that are not already addressable by
their existing filtering hypotheses.
3.1.2. Self Correction
The ability to make negative assertions
(“Message is not spam”), combined with the
dynamic nature of the confidence
assignment algorithm, permits speedy self-
correction when the initial prediction is
incompatible with the consensus view. Since
confidence and trust assignments are
intertwined, community disagreement
results in immediate correction of the
confidence of fingerprints, as well as a trust
reduction for reporting the fingerprints as
spam. This results in a historical trend
toward accuracy because only the reporters
who consistently make decisions aligned
with the consensus retain their trusted status.
From a learning perspective, the reporter’s
reputation or trust values represent the entire
history of good decisions and mistakes made
by the classifier.
3.1.3. Modeling Disagreement
One of the things we learned almost
immediately after the launch of TES was
that certain fingerprints would wildly flip-
flop across the average spam confidence
level. These fingerprints usually represented
newsletters and mass mailings that were
considered desirable by some and
undesirable by others. The community of
trusted recipients disagreed on the
disposition of these fingerprints because
there was no “real” community consensus
on whether or not the message was spam.
By modeling the pattern of disagreement,
we taught TES to identify this kind of
disagreement and flag such fingerprints as
contested. When agents query contested
fingerprints, they are informed of the
contention status so they can classify the
source emails based on out-of-band criteria,
which can be defined subjectively for all
recipients.
Contention modeling is extremely important
for a collaborative classifier because it
scopes the precision of the system. If the
limitations of the classifier are known, other
classification methods can be invoked as
required. In TES, contention logic is also
a catch-all defense against fingerprint
collision. If a set of spam and legitimate
emails happens to generate the same
fingerprint, the fingerprint is flagged as
contested, which excludes its disposition
from the classification decision. Historically
aggregated contention rates in the service
are an indicator of the level of disagreement
in the trusted community. The level of
disagreement in the service is very low,
which implies that the trust model can
successfully represent the collective wisdom
of the community.
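One simple way to detect the flip-flop pattern is to count how often a fingerprint's confidence crosses the threshold. The function name `is_contested`, the `max_flips` parameter and the sample histories are assumptions; the text does not say how TES models the pattern of disagreement.

```python
def is_contested(history, threshold, max_flips=2):
    """Flag a fingerprint whose confidence keeps crossing the threshold.

    history:   confidence values in the order they were recorded
    threshold: stands in for the average spam confidence
    max_flips: assumed number of crossings tolerated before flagging
    """
    # True where the fingerprint would be treated as spam at that moment.
    sides = [c >= threshold for c in history]
    # A flip is any change of side between consecutive observations.
    flips = sum(1 for a, b in zip(sides, sides[1:]) if a != b)
    return flips > max_flips

newsletter = is_contested([0.2, 0.8, 0.3, 0.9, 0.4], threshold=0.7)
ordinary_spam = is_contested([0.2, 0.5, 0.8, 0.9], threshold=0.7)
```

A fingerprint that settles after one crossing is left alone, while one that keeps oscillating is excluded from the classification decision, as described for contested fingerprints.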
Most machine learning systems, including
statistical text classifiers like Naive Bayes,
are unable to automatically identify
contested documents. This is why statistical
classifiers tend to work better in single-user
environments where recipient preferences
are consistent over time.
3.1.4. Resistance to Attack
An open, user feedback-driven system is an
attractive attack target for spammers. There
are essentially two ways of attacking the
service. One is through a technique called
hash busting, which attacks fingerprinting
algorithms by forcing them to generate
different fingerprints for each mutation of a
spam message. The second vector for attack
is through incorrect feedback.
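Hash busting can be illustrated by comparing an exact hash with a fingerprint computed over normalized text. Both functions below are hypothetical and far simpler than production fingerprinting algorithms; they are not the algorithms used by TES.

```python
import hashlib
import re

def exact_fingerprint(text):
    # Any single-character mutation changes this digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def normalized_fingerprint(text):
    # Lower-case the text and keep only runs of letters, so that random
    # digits, punctuation and spacing added by the sender are ignored.
    words = re.findall(r"[a-z]+", text.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()

# Two mutations of the same hypothetical spam message.
a = "Win a FREE phone now!!! ref 48213"
b = "win a free phone now... ref 90577"

exact_match = exact_fingerprint(a) == exact_fingerprint(b)
normalized_match = normalized_fingerprint(a) == normalized_fingerprint(b)
```

The two mutations defeat the exact hash but map to the same normalized fingerprint. An attacker who changes the wording itself would still defeat this simple normalization, which is why more robust fingerprinting is needed in practice.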
Most commonly, attackers attempt to
unblock their mailings before broadcasting
them to the general population. However, an
attacker must first be considered trusted in
order to affect the disposition of a
fingerprint. In order to gain trust, the
attacker must provide useful feedback over a
long period of time, which requires blocking
spam that others consider to be spam. In
other words, spammers must behave like
good recipients for an extended period of
time to get even a single identity to be
considered trusted. If they do spend the
effort building up a trusted identity, the
amount of damage they can do with one or
a few trusted identities is negligible because
the disagreement from the majority of the
trust community will result in harsh trust
penalties for the spammer identities. As the
pool of trusted users grows, it gets harder to
gain trust and easier to lose it. The strength
of attack resistance is proportional to
participation.
Expert-supervised systems are resistant to
such attacks by definition but are unable to
scale to a large number of experts. Similarly,
statistical text classification systems must go
through supervised training to avoid corpus
pollution. Supervision limits the amount of
training data that can be considered. One
real-world limitation, for example, is in a
supervised classification system’s inability
to adequately detect “foreign language”
spam—that is, spam in languages not
understood by supervisors.
4. CONCLUSION:
In this paper, we have proposed an approach
in which TES ensures the “reputation” of reporters by
tracking how often the larger recipient
community agrees with their assessment of a
message. In addition, the Trust Evaluation
System (TES) uses an automated system of
highly proficient fingerprinting algorithms.
Advanced Message Fingerprinting maintains
the privacy of the content and reduces the
amount of data to be analyzed. We improve
the global performance of the TES model to
filter out unwanted messages from Online
Social Networks (OSNs).
5. REFERENCES:
[1] Mr. Md. Amanatulla, J. Alcalá-Fdez, F.
Herrera, J. Otero, Genetic learning of
accurate and compact fuzzy rule based
systems based on the 2-tuples linguistic
representation, International Journal of
Approximate Reasoning 44 (2007) 45–64.
[2] A. Asuncion, D. Newman, 2007. UCI
machine learning repository. University of
California, Irvine, School of Information
and Computer Sciences. URL:
<http://www.ics.uci.edu/~mlearn/MLRepository.html>.
[3] R. Barandela, J.S. Sánchez, V. García, E.
Rangel, Strategies for learning in class
imbalance problems, Pattern Recognition 36
(3) (2003) 849–851.
[4] G.E.A.P.A. Batista, R.C. Prati, M.C.
Monard, A study of the behaviour of several
methods for balancing machine learning
training data, SIGKDD Explorations 6 (1)
(2004) 20–29.
[5] P. Campadelli, E. Casiraghi, G.
Valentini, Support vector machines for
candidate nodules classification, Letters on
Neurocomputing 68 (2005) 281–288.
[6] J.R. Cano, F. Herrera, M. Lozano, Using
evolutionary algorithms as instance selection
for data reduction in kdd: an experimental
study, IEEE Transactions on Evolutionary
Computation 7 (6) (2003) 561–575.
[7] N.V. Chawla, K.W. Bowyer, L.O. Hall,
W.P. Kegelmeyer, SMOTE: synthetic
minority over-sampling technique, Journal
of Artificial Intelligence Research 16 (2002)
321–357.
[8] N.V. Chawla, N. Japkowicz, A. Kolcz,
Editorial: special issue on learning from
imbalanced data-sets, SIGKDD Explorations
6 (1) (2004) 1–6.
[9] Z. Chi, H. Yan, T. Pham, Fuzzy
algorithms with applications to image
processing and pattern recognition, World
Scientific, 1996.
[10] J.-N. Choi, S.-K. Oh, W. Pedrycz,
Structural and parametric design of fuzzy
inference systems using hierarchical fair
competition-based parallel genetic
algorithms and information granulation,
International Journal of Approximate
Reasoning 49 (3) (2008) 631–648.
[11] D. Nelms, “Social networking growth
stats and patterns”,
http://socialmediatoday.com/amzini/306252/social-networking-growth-stats-and-patterns
[12] M. Vanetti, E. Binaghi, E. Ferrari, B.
Carminati, M. Carullo, “A System to Filter
Unwanted Messages from OSN User
Walls”, IEEE Transactions on Knowledge
and Data Engineering, Vol:25, June 2013
[13] F. Sebastiani, “Machine learning in
automated text categorization,” ACM
Computing Surveys, vol. 34, no. 1, pp. 1–
47, 2002
[14] C. D. Manning, P. Raghavan, and H.
Schutze, Introduction to Information
Retrieval. Cambridge, UK: Cambridge
University Press, 2008.
[15] Wikipedia page on “Multilayer
Perceptron”,
http://en.wikipedia.org/wiki/Multilayer_perceptron
[16] Wikipedia page about “Deep
Learning”,
http://en.wikipedia.org/wiki/Deep_learning
[17] Jiawei Han and Micheline Kamber,
“Data Mining: Concepts and Techniques”,
Second Edition
[18] Wikipedia page on “Precision and
recall”,
http://en.wikipedia.org/wiki/Precision_and_recall.