Resource management and search is very important yet challenging in large-scale distributed systems like
P2Pnetworks. Most existing P2P systems rely on indexing to efficiently route queries over the network.
However, searches based on such indices face two key issues. First, majority of existing search schemes
often rely on simply keyword based indices that can only support exact string based matches without taking
into account the meaning of words. Second it is difficult, if not impossible, to devise query based indexing
schemes that can represent all possible concept combinations without resulting in exponential index sizes.
To address these problems, we present BSI, a novel P2P indexing and query routing strategy to support
semantic based content searches. The BSI indexing structure captures the semantic content of documents
using a reference ontology. Our indexing scheme can efficiently handle multi-concept queries by
maintaining summary level information for each individual concept and concept combinations using a
novel space-efficient Two-level Semantic Bloom Filter(TSBF) data structure. By using TSBFs to represent
a large document and query base, BSI significantly reduces the communication cost and storage cost of
indices. Furthermore, We devise a low-overhead mechanism to allow peers to dynamically estimate the
relevance strength of a peer for multi-concept queries with high accuracy solely based on TSBFs. We also
propose a routing index compression mechanism to observe peers’ dynamic storage limitations with
minimal loss of information by exploiting a reference ontology structure. Based on the proposed index
structure, we design a novel query routing algorithm that exploits semantic based information to route
queries to semantically relevant peers. Performance evaluation demonstrates that our proposed approach
can improve the search recall of unstructured P2P systems up to 383.71% while keeping the
communication cost at a low level compared to state-of-art search mechanism OSQR [7].
P2P DOMAIN CLASSIFICATION USING DECISION TREE ijp2p
The increasing interest in Peer-to-Peer systems (such as Gnutella) has inspired many research activities
in this area. Although many demonstrations have been performed that show that the performance of a
Peer-to-Peer system is highly dependent on the underlying network characteristics, much of the
evaluation of Peer-to-Peer proposals has used simplified models that fail to include a detailed model of
the underlying network. This can be largely attributed to the complexity in experimenting with a scalable
Peer-to-Peer system simulator built on top of a scalable network simulator. A major problem of
unstructured P2P systems is their heavy network traffic. In Peer-to-Peer context, a challenging problem
is how to find the appropriate peer to deal with a given query without overly consuming bandwidth?
Different methods proposed routing strategies of queries taking into account the P2P network at hand.
This paper considers an unstructured P2P system based on an organization of peers around Super-Peers
that are connected to Super-Super-Peer according to their semantic domains; in addition to integrating
Decision Trees in P2P architectures to produce Query-Suitable Super-Peers, representing a community
of peers where one among them is able to answer the given query. By analyzing the queries log file, a
predictive model that avoids flooding queries in the P2P network is constructed after predicting the
appropriate Super-Peer, and hence the peer to answer the query. A challenging problem in a schemabased Peer-to-Peer (P2P) system is how to locate peers that are relevant to a given query. In this paper,
architecture, based on (Super-)Peers is proposed, focusing on query routing. The approach to be
implemented, groups together (Super-)Peers that have similar interests for an efficient query routing
method. In such groups, called Super-Super-Peers (SSP), Super-Peers submit queries that are often
processed by members of this group. A SSP is a specific Super-Peer which contains knowledge about: 1.
its Super-Peers and 2. The other SSP. Knowledge is extracted by using data mining techniques (e.g.
Decision Tree algorithms) starting from queries of peers that transit on the network. The advantage of
this distributed knowledge is that, it avoids making semantic mapping between heterogeneous data
sources owned by (Super-)Peers, each time the system decides to route query to other (Super-) Peers.
The set of SSP improves the robustness in queries routing mechanism, and the scalability in P2P
Network. Compared with a baseline approach,the proposal architecture shows the effect of the data
mining with better performance in respect to response time and precision.
For further details contact:
N.RAJASEKARAN B.E M.S 9841091117,9840103301.
IMPULSE TECHNOLOGIES,
Old No 251, New No 304,
2nd Floor,
Arcot road ,
Vadapalani ,
Chennai-26.
INTELLIGENT INFORMATION RETRIEVAL WITHIN DIGITAL LIBRARY USING DOMAIN ONTOLOGYcscpconf
A digital library is a type of information retrieval (IR) system. The existing information retrieval
methodologies generally have problems on keyword-searching. We proposed a model to solve
the problem by using concept-based approach (ontology) and metadata case base. This model
consists of identifying domain concepts in user’s query and applying expansion to them. The
system aims at contributing to an improved relevance of results retrieved from digital libraries
by proposing a conceptual query expansion for intelligent concept-based retrieval. We need to
import the concept of ontology, making use of its advantage of abundant semantics and
standard concept. Domain specific ontology can be used to improve information retrieval from
traditional level based on keyword to the lay based on knowledge (or concept) and change the
process of retrieval from traditional keyword matching to semantics matching. One approach is
query expansion techniques using domain ontology and the other would be introducing a case
based similarity measure for metadata information retrieval using Case Based Reasoning
(CBR) approach. Results show improvements over classic method, query expansion using
general purpose ontology and a number of other approaches.
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficientlyijsrd.com
Efficient and effective full-text retrieval in unstructured peer-to-peer networks remains a challenge in the research community. First, it is difficult, if not impossible, for unstructured P2P systems to effectively locate items with guaranteed recall. Second, existing schemes to improve search success rate often rely on replicating a large number of item replicas across the wide area network, incurring a large amount of communication and storage costs. In this paper, we propose BloomCast, an efficient and effective full-text retrieval scheme, in unstructured P2P networks. By leveraging a hybrid P2P protocol, BloomCast replicates the items uniformly at random across the P2P networks, achieving a guaranteed recall at a communication cost of O (N), where N is the size of the network. Furthermore, by casting Bloom Filters instead of the raw documents across the network, BloomCast significantly reduces the communication and storage costs for replication. Results show that BloomCast achieves an average query recall, which outperforms the existing WP algorithm by 18 percent, while BloomCast greatly reduces the search latency for query processing by 57 percent.
A Proximity-Aware Interest-Clustered P2P File Sharing System 1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
For further details contact:
N.RAJASEKARAN B.E M.S 9841091117,9840103301.
IMPULSE TECHNOLOGIES,
Old No 251, New No 304,
2nd Floor,
Arcot road ,
Vadapalani ,
Chennai-26.
www.impulse.net.in
Email: ieeeprojects@yahoo.com/ imbpulse@gmail.com
P2P DOMAIN CLASSIFICATION USING DECISION TREE ijp2p
The increasing interest in Peer-to-Peer systems (such as Gnutella) has inspired many research activities
in this area. Although many demonstrations have been performed that show that the performance of a
Peer-to-Peer system is highly dependent on the underlying network characteristics, much of the
evaluation of Peer-to-Peer proposals has used simplified models that fail to include a detailed model of
the underlying network. This can be largely attributed to the complexity in experimenting with a scalable
Peer-to-Peer system simulator built on top of a scalable network simulator. A major problem of
unstructured P2P systems is their heavy network traffic. In Peer-to-Peer context, a challenging problem
is how to find the appropriate peer to deal with a given query without overly consuming bandwidth?
Different methods proposed routing strategies of queries taking into account the P2P network at hand.
This paper considers an unstructured P2P system based on an organization of peers around Super-Peers
that are connected to Super-Super-Peer according to their semantic domains; in addition to integrating
Decision Trees in P2P architectures to produce Query-Suitable Super-Peers, representing a community
of peers where one among them is able to answer the given query. By analyzing the queries log file, a
predictive model that avoids flooding queries in the P2P network is constructed after predicting the
appropriate Super-Peer, and hence the peer to answer the query. A challenging problem in a schemabased Peer-to-Peer (P2P) system is how to locate peers that are relevant to a given query. In this paper,
architecture, based on (Super-)Peers is proposed, focusing on query routing. The approach to be
implemented, groups together (Super-)Peers that have similar interests for an efficient query routing
method. In such groups, called Super-Super-Peers (SSP), Super-Peers submit queries that are often
processed by members of this group. A SSP is a specific Super-Peer which contains knowledge about: 1.
its Super-Peers and 2. The other SSP. Knowledge is extracted by using data mining techniques (e.g.
Decision Tree algorithms) starting from queries of peers that transit on the network. The advantage of
this distributed knowledge is that, it avoids making semantic mapping between heterogeneous data
sources owned by (Super-)Peers, each time the system decides to route query to other (Super-) Peers.
The set of SSP improves the robustness in queries routing mechanism, and the scalability in P2P
Network. Compared with a baseline approach,the proposal architecture shows the effect of the data
mining with better performance in respect to response time and precision.
For further details contact:
N.RAJASEKARAN B.E M.S 9841091117,9840103301.
IMPULSE TECHNOLOGIES,
Old No 251, New No 304,
2nd Floor,
Arcot road ,
Vadapalani ,
Chennai-26.
INTELLIGENT INFORMATION RETRIEVAL WITHIN DIGITAL LIBRARY USING DOMAIN ONTOLOGYcscpconf
A digital library is a type of information retrieval (IR) system. The existing information retrieval
methodologies generally have problems on keyword-searching. We proposed a model to solve
the problem by using concept-based approach (ontology) and metadata case base. This model
consists of identifying domain concepts in user’s query and applying expansion to them. The
system aims at contributing to an improved relevance of results retrieved from digital libraries
by proposing a conceptual query expansion for intelligent concept-based retrieval. We need to
import the concept of ontology, making use of its advantage of abundant semantics and
standard concept. Domain specific ontology can be used to improve information retrieval from
traditional level based on keyword to the lay based on knowledge (or concept) and change the
process of retrieval from traditional keyword matching to semantics matching. One approach is
query expansion techniques using domain ontology and the other would be introducing a case
based similarity measure for metadata information retrieval using Case Based Reasoning
(CBR) approach. Results show improvements over classic method, query expansion using
general purpose ontology and a number of other approaches.
Full-Text Retrieval in Unstructured P2P Networks using Bloom Cast Efficientlyijsrd.com
Efficient and effective full-text retrieval in unstructured peer-to-peer networks remains a challenge in the research community. First, it is difficult, if not impossible, for unstructured P2P systems to effectively locate items with guaranteed recall. Second, existing schemes to improve search success rate often rely on replicating a large number of item replicas across the wide area network, incurring a large amount of communication and storage costs. In this paper, we propose BloomCast, an efficient and effective full-text retrieval scheme, in unstructured P2P networks. By leveraging a hybrid P2P protocol, BloomCast replicates the items uniformly at random across the P2P networks, achieving a guaranteed recall at a communication cost of O (N), where N is the size of the network. Furthermore, by casting Bloom Filters instead of the raw documents across the network, BloomCast significantly reduces the communication and storage costs for replication. Results show that BloomCast achieves an average query recall, which outperforms the existing WP algorithm by 18 percent, while BloomCast greatly reduces the search latency for query processing by 57 percent.
A Proximity-Aware Interest-Clustered P2P File Sharing System 1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
For further details contact:
N.RAJASEKARAN B.E M.S 9841091117,9840103301.
IMPULSE TECHNOLOGIES,
Old No 251, New No 304,
2nd Floor,
Arcot road ,
Vadapalani ,
Chennai-26.
www.impulse.net.in
Email: ieeeprojects@yahoo.com/ imbpulse@gmail.com
A SEMANTIC METADATA ENRICHMENT SOFTWARE ECOSYSTEM BASED ON TOPIC METADATA ENR...IJDKP
As existing computer search engines struggle to understand the meaning of natural language, semantically
enriched metadata may improve interest-based search engine capabilities and user satisfaction.
This paper presents an enhanced version of the ecosystem focusing on semantic topic metadata detection
and enrichments. It is based on a previous paper, a semantic metadata enrichment software ecosystem
(SMESE). Through text analysis approaches for topic detection and metadata enrichments this paper
propose an algorithm to enhance search engines capabilities and consequently help users finding content
according to their interests. It presents the design, implementation and evaluation of SATD (Scalable
Annotation-based Topic Detection) model and algorithm using metadata from the web, linked open data,
concordance rules, and bibliographic record authorities. It includes a prototype of a semantic engine using
keyword extraction, classification and concept extraction that allows generating semantic topics by text,
and multimedia document analysis using the proposed SATD model and algorithm.
The performance of the proposed ecosystem is evaluated using a number of prototype simulations by
comparing them to existing enriched metadata techniques (e.g., AlchemyAPI, DBpedia, Wikimeta, Bitext,
AIDA, TextRazor). It was noted that SATD algorithm supports more attributes than other algorithms. The
results show that the enhanced platform and its algorithm enable greater understanding of documents
related to user interests.
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...University of Bari (Italy)
The current abundance of electronic documents requires automatic techniques that support the users in understanding their content and extracting useful information. To this aim, improving the retrieval performance must necessarily go beyond simple lexical interpretation of the user queries, and pass through an understanding of their semantic content and aims. It goes without saying that any digital library would take enormous advantage from the availability of effective Information Retrieval techniques to provide to their users. This paper proposes an approach to Information Retrieval based on a correspondence of the domain of discourse between the query and the documents in the repository. Such an association is based on standard general-purpose linguistic resources (WordNet and WordNet Domains) and on a novel similarity assessment technique. Although the work is at a preliminary stage, interesting initial results suggest to go on extending and improving the approach.
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...idescitation
According to the survey India is one of the
leading countries in the word for technical education and
management education. Numbers of students are increasing
day by day by the growth rate of 45% per annum. Advancement
in technology puts special effect on education system. This
helps in upgrading higher education. Some universities and
colleges are using these technologies. Weblog is one of them.
Main aim of this paper is to represent web logs using clustering
technique for predicting next user movement and user
behavior analysis. This paper moves around the web log
clustering technique based on Markov chain results .In this
paper we present an ideal approach to web clustering
(clustering web site users) and predicting their behavior for
next visit. Methodology: For generating effective result approx
14 engineering college web usage data is used and an advance
clustering approach is presenting after optimizing the other
clustering approach.Results: The user behavior is predicted
with the help of the advance clustering approach based on the
FPCM and k-mean. Proposed algorithm is used to mined and
predict user’s preferred paths. To predict the user behavior
existing approaches have been used. But the existing
approaches are not enough because of its reaction towards
noise. Thus with the help of ACM, noise is reduced, provides
more accurate result for predicting the user behavior. Approach
Implementation:The algorithm was implemented in MAT
LAB, DTRG and in Java .The experiment result proves that
this method is very effective in predicting user behavior. The
experimental results have validated the method’s effectiveness
in comparison with some previous studies.
International Journal of Engineering Inventions (IJEI) provides a multidisciplinary passage for researchers, managers, professionals, practitioners and students around the globe to publish high quality, peer-reviewed articles on all theoretical and empirical aspects of Engineering and Science.
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
Context Driven Technique for Document ClassificationIDES Editor
In this paper we present an innovative hybrid Text
Classification (TC) system that bridges the gap between
statistical and context based techniques. Our algorithm
harnesses contextual information at two stages. First it extracts
a cohesive set of keywords for each category by using lexical
references, implicit context as derived from LSA and wordvicinity
driven semantics. And secondly, each document is
represented by a set of context rich features whose values are
derived by considering both lexical cohesion as well as the extent
of coverage of salient concepts via lexical chaining. After
keywords are extracted, a subset of the input documents is
apportioned as training set. Its members are assigned categories
based on their keyword representation. These labeled
documents are used to train binary SVM classifiers, one for
each category. The remaining documents are supplied to the
trained classifiers in the form of their context-enhanced feature
vectors. Each document is finally ascribed its appropriate
category by an SVM classifier.
INFORMATION RETRIEVAL TECHNIQUE FOR WEB USING NLP ijnlc
Information retrieval is becoming an intricate part of every domain. Be it in acquiring data from various sources to form a single unit or to present the data in such a way that anyone can extract useful information and hence used in data analysis, data mining etc. This arena has gained much importance in the recent years because as of today we are exploded with various kind of information from the real-world. The growing importance of research data and retrieving the intelligent data are the main focus for any business today. So coming years this is a field where major work need to be done. We have focused here to implement a system for information retrieval from the webpages using Natural Language Processing (NLP) and have shown to getting better results than the existing system. Webpages is a home to huge amount of information from various entities in the real-world. Here we have designed a system for information retrieval technique for web using NLP where techniques Hierarchical Conditional Random Fields (i.e. HCRF) and extended Semi-Markov Conditional Random Fields (i.e. Semi-CRF) along with Visual Page Segmentation is used to get the accurate results. Also parallel processing is used to achieve the results in desired time frame. It further improves the decision making between HCRF and Semi-CRF by using bidirectional approach rather than top-down approach. It enables better understanding of the content and page structure.
The classic mutual exclusion problem in distributed systems occurs when only one process should access a
shared resource. Various algorithms are proposed in order to solve this problem. When using a permission
based approach which consist in exchanging permission messages to grant access to the critical resource,
less messages should be sent over the network because bandwidth consumption and synchronization delay
should be reduced. Richa, shikha and Pooja proposed an algorithm using nodes logically organized in a
complete binary tree. This algorithm called NTBCBT requires 4log2(N) messages per access to critical
section and a synchronization delay of 3log2(N) for a set of N nodes competing for the critical ressource. In
this paper, we study NTBCBT and we show that this algorithm has problems related with safety, liveness
and scheduling. We improve this algorithm by correcting these weaknesses. Moreover, our algorithm
requires 3log(N) messages per access to critical section and a synchronization delay of 2log(N). This
improvement is due to the removal of useless messages, a reorganization of instructions on each node and
an insertion of access requests using their timestamp.
This is a complete automated solution for the existing energy distribution and monitoring system in
India,which can monitor the meter readings continuously and take necessary actions to maintain the power
grid stable. A Power Line Communication (PLC) based modem is integrated with each electronic energy
meter. Through PLC the meters communicate with the coordinator. Coordinator makes use of GPRS modem
to upload/download data to/from internet. A personal computer with an internet connection at the other end,
which contains the database acts as the billing point. Live meter reading sent back to this billing point
periodically and these details are updated in a central database. An interactive, user friendly graphical
interface is present at user end. All the energy logs, notices from the Government, billing details and average
statistics will be available here. The system splits the loads into critical loads and non critical loads. This
makes the distribution system more intelligent. More over prior information about the power cuts can be
done. We can easily implement many add-ons such as energy demand prediction, real time dynamic tariff as
a function of demand and supply and so on.
Wireless sensor network are emerging in various fields like environmental monitoring, mining, surveillance
system, medical monitoring. LEACH protocol is one of the predominantly used clustering routing protocols
in wireless sensor networks. In Leach each node has equal chance to become a cluster head which make
the energy dissipated of each node be moderately balanced. We have pioneered an improved algorithm
named as Novel Leach based on Leach protocol. The proposed algorithm shows the significant
improvement in network lifetime .Comparison of proposed algorithm is done with basic leach in terms of
network life time, cluster head selection, energy consumption, and data transmission to base station. The
simulation results shows that proposed algorithm can reduce network energy consumption and prolong
network life commendably. Simulation of our protocol is done with Matlab.
Diseño de placa plana cuadrada, concreto reforzado IIEnrique Santana
Diseñe una losa plana cuadrada con las siguientes consideraciones:
- Columnas de 48 cm x 48 cm, altura de 4.6 m.
- Resistencia del concreto: f_c^'=280 kg⁄〖cm〗^2 .
- Resistencia del acero: f_y=4,200 kg⁄〖cm〗^2 .
In recent years, research efforts tried to exploit peer-to-peer (P2P) systems in order to provide Live Streaming (LS) and Video-on-Demand (VoD) services. Most of these research efforts focus on the development of distributed P2P block schedulers for content exchange among the participating peers and on the characteristics of the overlay graph (P2P overlay) that interconnects the set of these peers.Currently, researchers try to combine peer-to-peer systems with cloud infrastructures. They developed monitoring and control architectures that use resources from the cloud in order to enhance QoS and achieve an attractive trade-off between stability and low cost operation. However, there is a lack of
research effort on the congestion control of these systems and the existing congestion control architectures are not suitable for P2P live streaming traffic (small sequential non persistent traffic towards multiple network locations). This paper proposes a P2P live streaming traffic aware congestion control protocol that: i) is capable to manage sequential traffic heading to multiple network destinations , ii) efficiently exploits the available bandwidth, iii) accurately measures the idle peer resources, iv) avoids network congestion, and v) is friendly to traditional TCP generated traffic.The proposed P2P congestion control has been implemented, tested and evaluated through a series of real experiments powered across the BonFIRE infrastructure.
Load balancing is one of the main challenges of every structured peer-to-peer (P2P) system that uses
distributed hash tables to map and distribute data items (objects) onto the nodes of the system. In a typical
P2P system with N nodes, the use of random hash functions for distributing keys among peer nodes can
lead to O(log N) imbalance. Most existing load balancing algorithms for structured P2P systems are not
adaptable to objects’ variant loads in different system conditions, assume uniform distribution of objects in
the system, and often ignore node heterogeneity. In this paper we propose a load balancing algorithm that
considers the above issues by applying node movement and replication mechanisms while load balancing.
Given the high overhead of replication, we postpone this mechanism as much as possible, but we use it
when necessary. Simulation results show that our algorithm is able to balance the load within 85% of the
optimal value.
A SEMANTIC METADATA ENRICHMENT SOFTWARE ECOSYSTEM BASED ON TOPIC METADATA ENR...IJDKP
As existing computer search engines struggle to understand the meaning of natural language, semantically
enriched metadata may improve interest-based search engine capabilities and user satisfaction.
This paper presents an enhanced version of the ecosystem focusing on semantic topic metadata detection
and enrichments. It is based on a previous paper, a semantic metadata enrichment software ecosystem
(SMESE). Through text analysis approaches for topic detection and metadata enrichments this paper
propose an algorithm to enhance search engines capabilities and consequently help users finding content
according to their interests. It presents the design, implementation and evaluation of SATD (Scalable
Annotation-based Topic Detection) model and algorithm using metadata from the web, linked open data,
concordance rules, and bibliographic record authorities. It includes a prototype of a semantic engine using
keyword extraction, classification and concept extraction that allows generating semantic topics by text,
and multimedia document analysis using the proposed SATD model and algorithm.
The performance of the proposed ecosystem is evaluated using a number of prototype simulations by
comparing them to existing enriched metadata techniques (e.g., AlchemyAPI, DBpedia, Wikimeta, Bitext,
AIDA, TextRazor). It was noted that SATD algorithm supports more attributes than other algorithms. The
results show that the enhanced platform and its algorithm enable greater understanding of documents
related to user interests.
A Domain Based Approach to Information Retrieval in Digital Libraries - Rotel...University of Bari (Italy)
The current abundance of electronic documents requires automatic techniques that support the users in understanding their content and extracting useful information. To this aim, improving the retrieval performance must necessarily go beyond simple lexical interpretation of the user queries, and pass through an understanding of their semantic content and aims. It goes without saying that any digital library would take enormous advantage from the availability of effective Information Retrieval techniques to provide to their users. This paper proposes an approach to Information Retrieval based on a correspondence of the domain of discourse between the query and the documents in the repository. Such an association is based on standard general-purpose linguistic resources (WordNet and WordNet Domains) and on a novel similarity assessment technique. Although the work is at a preliminary stage, interesting initial results suggest to go on extending and improving the approach.
Advance Clustering Technique Based on Markov Chain for Predicting Next User M...idescitation
According to the survey India is one of the
leading countries in the word for technical education and
management education. Numbers of students are increasing
day by day by the growth rate of 45% per annum. Advancement
in technology puts special effect on education system. This
helps in upgrading higher education. Some universities and
colleges are using these technologies. Weblog is one of them.
Main aim of this paper is to represent web logs using clustering
technique for predicting next user movement and user
behavior analysis. This paper moves around the web log
clustering technique based on Markov chain results .In this
paper we present an ideal approach to web clustering
(clustering web site users) and predicting their behavior for
next visit. Methodology: For generating effective result approx
14 engineering college web usage data is used and an advance
clustering approach is presenting after optimizing the other
clustering approach.Results: The user behavior is predicted
with the help of the advance clustering approach based on the
FPCM and k-mean. Proposed algorithm is used to mined and
predict user’s preferred paths. To predict the user behavior
existing approaches have been used. But the existing
approaches are not enough because of its reaction towards
noise. Thus with the help of ACM, noise is reduced, provides
more accurate result for predicting the user behavior. Approach
Implementation:The algorithm was implemented in MAT
LAB, DTRG and in Java .The experiment result proves that
this method is very effective in predicting user behavior. The
experimental results have validated the method’s effectiveness
in comparison with some previous studies.
International Journal of Engineering Inventions (IJEI) provides a multidisciplinary passage for researchers, managers, professionals, practitioners and students around the globe to publish high quality, peer-reviewed articles on all theoretical and empirical aspects of Engineering and Science.
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
Context Driven Technique for Document ClassificationIDES Editor
In this paper we present an innovative hybrid Text
Classification (TC) system that bridges the gap between
statistical and context based techniques. Our algorithm
harnesses contextual information at two stages. First it extracts
a cohesive set of keywords for each category by using lexical
references, implicit context as derived from LSA and wordvicinity
driven semantics. And secondly, each document is
represented by a set of context rich features whose values are
derived by considering both lexical cohesion as well as the extent
of coverage of salient concepts via lexical chaining. After
keywords are extracted, a subset of the input documents is
apportioned as training set. Its members are assigned categories
based on their keyword representation. These labeled
documents are used to train binary SVM classifiers, one for
each category. The remaining documents are supplied to the
trained classifiers in the form of their context-enhanced feature
vectors. Each document is finally ascribed its appropriate
category by an SVM classifier.
INFORMATION RETRIEVAL TECHNIQUE FOR WEB USING NLP ijnlc
Information retrieval is becoming an intricate part of every domain. Be it in acquiring data from various sources to form a single unit or to present the data in such a way that anyone can extract useful information and hence used in data analysis, data mining etc. This arena has gained much importance in the recent years because as of today we are exploded with various kind of information from the real-world. The growing importance of research data and retrieving the intelligent data are the main focus for any business today. So coming years this is a field where major work need to be done. We have focused here to implement a system for information retrieval from the webpages using Natural Language Processing (NLP) and have shown to getting better results than the existing system. Webpages is a home to huge amount of information from various entities in the real-world. Here we have designed a system for information retrieval technique for web using NLP where techniques Hierarchical Conditional Random Fields (i.e. HCRF) and extended Semi-Markov Conditional Random Fields (i.e. Semi-CRF) along with Visual Page Segmentation is used to get the accurate results. Also parallel processing is used to achieve the results in desired time frame. It further improves the decision making between HCRF and Semi-CRF by using bidirectional approach rather than top-down approach. It enables better understanding of the content and page structure.
The classic mutual exclusion problem in distributed systems occurs when only one process should access a
shared resource. Various algorithms are proposed in order to solve this problem. When using a permission
based approach which consist in exchanging permission messages to grant access to the critical resource,
less messages should be sent over the network because bandwidth consumption and synchronization delay
should be reduced. Richa, shikha and Pooja proposed an algorithm using nodes logically organized in a
complete binary tree. This algorithm called NTBCBT requires 4log2(N) messages per access to critical
section and a synchronization delay of 3log2(N) for a set of N nodes competing for the critical ressource. In
this paper, we study NTBCBT and we show that this algorithm has problems related with safety, liveness
and scheduling. We improve this algorithm by correcting these weaknesses. Moreover, our algorithm
requires 3log(N) messages per access to critical section and a synchronization delay of 2log(N). This
improvement is due to the removal of useless messages, a reorganization of instructions on each node and
an insertion of access requests using their timestamp.
This is a complete automated solution for the existing energy distribution and monitoring system in
India,which can monitor the meter readings continuously and take necessary actions to maintain the power
grid stable. A Power Line Communication (PLC) based modem is integrated with each electronic energy
meter. Through PLC the meters communicate with the coordinator. Coordinator makes use of GPRS modem
to upload/download data to/from internet. A personal computer with an internet connection at the other end,
which contains the database acts as the billing point. Live meter reading sent back to this billing point
periodically and these details are updated in a central database. An interactive, user friendly graphical
interface is present at user end. All the energy logs, notices from the Government, billing details and average
statistics will be available here. The system splits the loads into critical loads and non critical loads. This
makes the distribution system more intelligent. More over prior information about the power cuts can be
done. We can easily implement many add-ons such as energy demand prediction, real time dynamic tariff as
a function of demand and supply and so on.
Wireless sensor network are emerging in various fields like environmental monitoring, mining, surveillance
system, medical monitoring. LEACH protocol is one of the predominantly used clustering routing protocols
in wireless sensor networks. In Leach each node has equal chance to become a cluster head which make
the energy dissipated of each node be moderately balanced. We have pioneered an improved algorithm
named as Novel Leach based on Leach protocol. The proposed algorithm shows the significant
improvement in network lifetime .Comparison of proposed algorithm is done with basic leach in terms of
network life time, cluster head selection, energy consumption, and data transmission to base station. The
simulation results shows that proposed algorithm can reduce network energy consumption and prolong
network life commendably. Simulation of our protocol is done with Matlab.
Diseño de placa plana cuadrada, concreto reforzado IIEnrique Santana
Diseñe una losa plana cuadrada con las siguientes consideraciones:
- Columnas de 48 cm x 48 cm, altura de 4.6 m.
- Resistencia del concreto: f_c^'=280 kg⁄〖cm〗^2 .
- Resistencia del acero: f_y=4,200 kg⁄〖cm〗^2 .
In recent years, research efforts tried to exploit peer-to-peer (P2P) systems in order to provide Live Streaming (LS) and Video-on-Demand (VoD) services. Most of these research efforts focus on the development of distributed P2P block schedulers for content exchange among the participating peers and on the characteristics of the overlay graph (P2P overlay) that interconnects the set of these peers.Currently, researchers try to combine peer-to-peer systems with cloud infrastructures. They developed monitoring and control architectures that use resources from the cloud in order to enhance QoS and achieve an attractive trade-off between stability and low cost operation. However, there is a lack of
research effort on the congestion control of these systems and the existing congestion control architectures are not suitable for P2P live streaming traffic (small sequential non persistent traffic towards multiple network locations). This paper proposes a P2P live streaming traffic aware congestion control protocol that: i) is capable to manage sequential traffic heading to multiple network destinations , ii) efficiently exploits the available bandwidth, iii) accurately measures the idle peer resources, iv) avoids network congestion, and v) is friendly to traditional TCP generated traffic.The proposed P2P congestion control has been implemented, tested and evaluated through a series of real experiments powered across the BonFIRE infrastructure.
Load balancing is one of the main challenges of every structured peer-to-peer (P2P) system that uses
distributed hash tables to map and distribute data items (objects) onto the nodes of the system. In a typical
P2P system with N nodes, the use of random hash functions for distributing keys among peer nodes can
lead to O(log N) imbalance. Most existing load balancing algorithms for structured P2P systems are not
adaptable to objects’ variant loads in different system conditions, assume uniform distribution of objects in
the system, and often ignore node heterogeneity. In this paper we propose a load balancing algorithm that
considers the above issues by applying node movement and replication mechanisms while load balancing.
Given the high overhead of replication, we postpone this mechanism as much as possible, but we use it
when necessary. Simulation results show that our algorithm is able to balance the load within 85% of the
optimal value.
The classic mutual exclusion problem in distributed systems occurs when only one process should access a
shared resource. Various algorithms are proposed in order to solve this problem. When using a permission
based approach which consist in exchanging permission messages to grant access to the critical resource,
less messages should be sent over the network because bandwidth consumption and synchronization delay
should be reduced. Richa, shikha and Pooja proposed an algorithm using nodes logically organized in a
complete binary tree. This algorithm called NTBCBT requires 4log2(N) messages per access to critical
section and a synchronization delay of 3log2(N) for a set of N nodes competing for the critical ressource. In
this paper, we study NTBCBT and we show that this algorithm has problems related with safety, liveness
and scheduling. We improve this algorithm by correcting these weaknesses. Moreover, our algorithm
requires 3log(N) messages per access to critical section and a synchronization delay of 2log(N). This
improvement is due to the removal of useless messages, a reorganization of instructions on each node and
an insertion of access requests using their timestamp.
Método equivalente lineal, espectro y amplificaciónEnrique Santana
Análisis de sitio en el campus de la UNAN, Managua a través de espectros elásticos de respuesta generados por datos de 5 terremotos utilizando el software “DEEPSOIL Versión 6.0.6.1”.
In this paper, we have proposed a hybrid push-pull protocol for peer-to-peer live video streaming. The
main goal of this research is to minimize the network end-to-end delay in comparison to pure mesh
networks. Hybrid protocols, in most cases, suffer from complex construction and maintenance. Therefore,
our proposed protocol uses a pure mesh topology and a single layer video coding. In summary, our pushpull
protocol has two parts. The pull-based part which is done on the mesh network, and the push-based
part which consists of two phases: parent selection and tree construction. When a push procedure appears,
it is very important to prevent data redundancy. To satisfy this condition, we have introduced a parent
selection method. In this method, by parent selection based on the minimum arrival time, the most stable
node will be selected. This node has the advantage of maximizing the expected service time of the tree.
Using this method, there is no need for maintaining any extra information and topology control data.
Finally, we do performance evaluation using OMNeT++ simulator. The simulation results show that the
proposed architecture has better performance in start-up delay, end-to-end delay, and distortion than pure
mesh-based network.
PERFORMANCE ANALYSIS AND COMPARISON OF IMPROVED DSR WITH DSR, AODV AND DSDV R...ijp2p
Mobile Ad-hoc networks are categorized by multi-hop wireless connectivity and numbers of nodes are connecting each other through wireless network. It includes several routing protocols specifically designed for ad-hoc routing. The most widely used ad hoc routing protocols are Ad-hoc On Demand Distance Vector (AODV), Destination Sequence Distance Vector (DSDV), and Dynamic Source Routing (DSR). In this paper, we present an analysis of DSR protocol and propose our algorithm to improve the performance of DSR protocol by using small delay applied on last route ACK path when an original route fails in Mobile Ad Hoc networks. Past researchers the MANET have focused on simulation study by varying network parameters, such as network size, number of nodes. The simulation results shows that the M-DSR protocol
having some excellent performance Metrics then other protocols. We have taken different performance parameters over the comparison of Modified -DSR with other three protocols in mobility as well as Nonmobility scenario up to 300 nodes in MANETs using NS2 simulator. To achieve this goal DSR is modified by using modified algorithm technique in order to load balancing, to avoid congestion and lower packet
delivery.
Our area of interest for the paper is the improvement of performance of DSR routing protocol by
changing in algorithm and this Improved DSR protocol should compare with remaining protocols
taken in this research paper.
2. In this paper we made changesin traditional DSR protocol and generation of new improved DSR the
different performance parameters and compare with AODV/DSR/DSDV protocols in mobility and
non- mobility scenarios nodes up to 300.
3. We can plot the graphs throughput, End to end Delay, Packet delivery Ratio, Dropping Ratio, and
average energy consumption on Mobility and Non-Mobility scenario by using Network Simulator
version 2.34 for Modified DSR protocols. M-DSR, DSDV perform well when Mobility is low.
A Cooperative Peer Clustering Scheme for Unstructured Peer-to-Peer Systemsijp2p
This paper proposes a peer clustering scheme for unstructured Peer-to-Peer (P2P) systems. The proposed
scheme consists of an identification of critical links, local reconfiguration of incident links, and a
retaliation rule. The simulation result indicates that the proposed scheme improves the performance of
previous schemes and that a peer taking a cooperative action will receive a higher profit than selfish peers.
Análisis de vibración ambiental en el edificio del instituto de geología y ge...Enrique Santana
En la siguiente monografía, se presenta la técnica de cociente espectral con el método de Nakamura, aplicada al edificio del IGG-CIGEO de la UNAN-Managua (Nicaragua). Se espera que este trabajo sea de utilidad a todo el que tenga deseos de aprender. El conocimiento debe ser libre e igual para todos.
Diseño de puente mixto (losa de concreto y vigas de acero)Enrique Santana
Entonces como el cortante máximo es τ_MÁXIMO=4.710 ksi, menor al cortante permisible τ_PERM=12 ksi, por tanto la viga puede soportar la carga por cortante. Debido a ello no se recomienda usar atiesador para la viga, pues incurre en gastos poco necesarios; en el caso que se haga el diseño para un vehículo de mayor peso, se deben revisar los cortantes y momentos máximos.
Semantic Search of E-Learning Documents Using Ontology Based Systemijcnes
The keyword searching mechanism is traditionally used for information retrieval from Web based systems. However, this system fails to meet the requirements in Web searching of the expert knowledge base based on the popular semantic systems. Semantic search of E-learning documents based on ontology is increasingly adopted in information retrieval systems. Ontology based system simplifies the task of finding correct information on the Web by building a search system based on the meaning of keyword instead of the keyword itself. The major function of the ontology based system is the development of specification of conceptualization which enhances the connection between the information present in the Web pages with that of the background knowledge.The semantic gap existing between the keyword found in documents and those in query can be matched suitably using Ontology based system. This paper provides a detailed account of the semantic search of E-learning documents using ontology based system by making comparison between various ontology systems. Based on this comparison, this survey attempts to identify the possible directions for future research.
A QUERY LEARNING ROUTING APPROACH BASED ON SEMANTIC CLUSTERSijait
Peer-to-peer systems have recently a remarkable success in the social, academic, and commercial communities. A fundamental problem in Peer-to-Peer systems is how to efficiently locate appropriate peers to answer a specific query (Query Routing Problem). A lot of approaches have been carried out to enhance search result quality as well as to reduce network overhead. Recently, researches focus on methods based on query-oriented routing indices. These methods utilize the historical information of past queries and query hits to build a local knowledge base per peer, which represents the user's interests or profile. When a peer forwards a given query, it evaluates the query against its local knowledge base in order to select a set of relevant peers to whom the query will be routed. Usually, an insufficient number of relevant peers is selected from the current peer's local knowledge base thus a broadcast search is investigated which badly affects the approach efficiency. To tackle this problem, we introduce a novel method that clusters peers having similar interests. It exploits not only the current peer's knowledge base but also that of the others in
the cluster to extract relevant peers. We implemented the proposed approach, and tested (i) its retrieval effectiveness in terms of recall and precision, (ii) its search cost in terms of messages traffic and visited peers number. Experimental results show that our approach improves the recall and precision metrics while reducing dramatically messages traffic.
Existing work on keyword search relies on an element-level model (data graphs) to compute keyword query results.Elements mentioning keywords are retrieved from this model and paths between them are explored to compute Steiner graphs. KRG (keyword Element Relationship Graph) captures relationships at the keyword level.Relationships captured by a KRG are not direct edges between tuples but stand for paths between keywords.
We propose to route keywords only to relevant sources to reduce the high cost of processing keyword search queries over all sources. A multilevel scoring mechanism is proposed for computing the relevance of routing plans based on scores at the level of keywords, data elements,element sets, and subgraphs that connect these elements
Performance Evaluation of Query Processing Techniques in Information Retrievalidescitation
The first element of the search process is the query.
The user query being on an average restricted to two or three
keywords makes the query ambiguous to the search engine.
Given the user query, the goal of an Information Retrieval
[IR] system is to retrieve information which might be useful
or relevant to the information need of the user. Hence, the
query processing plays an important role in IR system.
The query processing can be divided into four categories
i.e. query expansion, query optimization, query classification and
query parsing. In this paper an attempt is made to evaluate the
performance of query processing algorithms in each of the
category. The evaluation was based on dataset as specified by
Forum for Information Retrieval [FIRE15]. The criteria used
for evaluation are precision and relative recall. The analysis is
based on the importance of each step in query processing. The
experimental results show that the significance of each step
in query processing and also the relevance of web semantics
and spelling correction in the user query.
SEMANTIC INFORMATION EXTRACTION IN UNIVERSITY DOMAINcscpconf
Today’s conventional search engines hardly do provide the essential content relevant to the
user’s search query. This is because the context and semantics of the request made by the user
is not analyzed to the full extent. So here the need for a semantic web search arises. SWS is
upcoming in the area of web search which combines Natural Language Processing and
Artificial Intelligence.
The objective of the work done here is to design, develop and implement a semantic search
engine- SIEU(Semantic Information Extraction in University Domain) confined to the
university domain. SIEU uses ontology as a knowledge base for the information retrieval
process. It is not just a mere keyword search. It is one layer above what Google or any other
search engines retrieve by analyzing just the keywords. Here the query is analyzed both
syntactically and semantically.
The developed system retrieves the web results more relevant to the user query through keyword
expansion. The results obtained here will be accurate enough to satisfy the request made by the
user. The level of accuracy will be enhanced since the query is analyzed semantically. The
system will be of great use to the developers and researchers who work on web. The Google results are re-ranked and optimized for providing the relevant links. For ranking an algorithm has been applied which fetches more apt results for the user query
Privacy Preserved Distributed Data Sharing with Load Balancing SchemeEditor IJMTER
Data sharing services are provided under the Peer to Peer (P2P) environment. Federated
database technology is used to manage locally stored data with a federated DBMS and provide unified
data access. Information brokering systems (IBSs) are used to connect large-scale loosely federated data
sources via a brokering overlay. Information brokers redirect the client queries to the requested data
servers. Privacy preserving methods are used to protect the data location and data consumer. Brokers are
trusted to adopt server-side access control for data confidentiality. Query and access control rules are
maintained with shared data details under metadata. A Semantic-aware index mechanism is applied to
route the queries based on their content and allow users to submit queries without data or server
information.
Distributed data sharing is managed with Privacy Preserved Information Brokering (PPIB)
scheme. Attribute-correlation attack and inference attacks are handled by the PPIB. PPIB overlay
infrastructure consisting of two types of brokering components, brokers and coordinators. The brokers
acts as mix anonymizer are responsible for user authentication and query forwarding. The coordinators
concatenated in a tree structure, enforce access control and query routing based on the automata.
Automata segmentation and query segment encryption schemes are used in the Privacy-preserving
Query Brokering (QBroker). Automaton segmentation scheme is used to logically divide the global
automaton into multiple independent segments. The query segment encryption scheme consists of the
preencryption and postencryption modules.
The PPIB scheme is enhanced to support dynamic site distribution and load balancing
mechanism. Peer workloads and trust level of each peer are integrated with the site distribution process.
The PPIB is improved to adopt self reconfigurable mechanism. Automated decision support system for
administrators is included in the PPIB.
Study on security and quality of service implementations in p2 p overlay netw...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
To Get any Project for CSE, IT ECE, EEE Contact Me @ 09666155510, 09849539085 or mail us - ieeefinalsemprojects@gmail.com-Visit Our Website: www.finalyearprojects.org
COMPARATIVE STUDY OF CAN, PASTRY, KADEMLIA AND CHORD DHTSijp2p
Peer-to-Peer (P2P) systems allow decentralization, sharing of all the resources of a network with direct
communication and collaboration between nodes. There are three main families of P2P networks: the
centralized architecture, the decentralized architecture that can be structured or unstructured and the
hybrid architecture. Today, there are several implementations for structured decentralized architectures.
This implies that the insertion and search algorithms are different. Among them we have; Chord, Pastry,
Kademlia, CAN(Content Addressable Network) . The choice of these DHTs (Distributed Hash Table) for an
application is made on the basis of their performances. Studies of each of these DHTs mentioned have been
done, proving their performance. But a comparative study of the four DHTs Chord, Pastry, CAN, Kademlia
has not been clearly addressed by previous works. In this paper, we have conducted a comparative
theoretical study of the DHTs Chord, Pastry, CAN, Kademlia. Then, by simulation, we have evaluated the
performances in terms of latency, number of hops and number of transmitted messages. Our study clearly
shows the differences between mathematically established performance and actual performance in an
environment with less restriction. This analysis was made from the data obtained by using the simple
network layer of the PeerfactSim simulator. This simulator abstracts the different network layers, which
gives the advantage of testing the performances with reasonable accuracy. The use of the single network
layer can be considered an ideal case because the node searches are done locally.
International Journal of Peer to Peer Networks .docxijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best practices in the area of P2P networking. This will include works that relate to peer-to-peer systems, peer-to-peer applications, grid systems, large-scale distributed systems, and overlay networks. The journal offers a forum in which academics, consultants, and practitioners in a variety of fields can exchange ideas to further research and improve practices in all areas of P2P.
Authors are solicited to contribute to the journal by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the areas of P2P networks.
International Journal of peer-to-peer networks (IJP2P)ijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best practices in the area of P2P networking.
International Journal of peer-to-peer networks (IJP2P)ijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results
in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best
practices in the area of P2P networking. This will include works that relate to peer-to-peer systems, peer-to-peer applications, grid systems,
large-scale distributed systems, and overlay networks. The journal offers a forum in which academics, consultants, and practitioners in a variety
of fields can exchange ideas to further research and improve practices in all areas of P2P.
2nd International Conference on Big Data, IoT and Machine Learning (BIOM 2022)ijp2p
2nd International Conference on Big Data, IoT and Machine Learning (BIOM 2022) will act as a major forum for the presentation of innovative ideas, approaches, developments, and research projects in the areas of Big Data, Internet of Things (IoT) and Machine Learning. It will also serve to facilitate the exchange of information between researchers and industry professionals to discuss the latest issues and advancement in the area of Big Data, IoT and Machine Learning.
7th International Conference on Networks, Communications, Wireless and Mobile...ijp2p
7th International Conference on Networks, Communications, Wireless and Mobile Computing (NCWMC 2022) looks for significant contributions to the Computer Networks, Communications, wireless and mobile computing for wired and wireless networks in theoretical and practical aspects. Original papers are invited on computer Networks, network protocols and wireless networks, Data communication Technologies, network security and mobile computing. The goal of this Conference is to bring together researchers and practitioners from academia and industry to focus on advanced networking concepts and establishing new collaborations in these areas.
4th International Conference on Internet of Things (CIoT 2022)ijp2p
4th International Conference on Internet of Things (CIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of IoT.
11th International conference on Parallel, Distributed Computing and Applicat...ijp2p
11th International conference on Parallel, Distributed Computing and Applications (IPDCA 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Parallel, Distributed Computing. Original papers are invited on Algorithms and Applications, computer Networks, Cyber trust and security, Wireless networks and mobile Computing and Bioinformatics. The aim of the conference is to provide a platform to the researchers and practitioners from both academia as well as industry to meet and share cutting-edge development in the field.
3rd International Conference on Machine learning and Cloud Computing (MLCL 2022)ijp2p
3rd International Conference on Machine learning and Cloud Computing (MLCL 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of on Machine Learning & Cloud computing. The aim of the conference is to provide a platform to the researchers and practitioners from both academia as well as industry to meet and share cutting-edge development in the field.
4th International Conference on Internet of Things (CIoT 2022) ijp2p
4th International Conference on Internet of Things (CIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of IoT.
4th International Conference on Internet of Things (CIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of IoT.
International Journal of peer-to-peer networks (IJP2P)ijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best practices in the area of P2P networking. This will include works that relate to peer-to-peer systems, peer-to-peer applications, grid systems, large-scale distributed systems, and overlay networks. The journal offers a forum in which academics, consultants, and practitioners in a variety of fields can exchange ideas to further research and improve practices in all areas of P2P.
3rd International Conference on Networks, Blockchain and Internet of Things (...ijp2p
3rd International Conference on Networks, Blockchain and Internet of Things (NBIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Networks, Blockchain and Internet of Things. The Conference looks for significant contributions to all major fields of the Networks, Blockchain and Internet of Things in theoretical and practical aspects.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the following areas but are not limited to:
3rd International Conference on NLP & Information Retrieval (NLPI 2022)ijp2p
3rd International Conference on NLP & Information Retrieval (NLPI 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Natural Language Computing and Information Retrieval.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the following areas, but are not limited to.
CALL FOR PAPERS - 14th International Conference on Wireless & Mobile Networks...ijp2p
14th International Conference on Wireless & Mobile Networks (WiMoNe 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Wireless & Mobile computing Environment. Current information age is witnessing a dramatic use of digital and electronic devices in the workplace and beyond. Wireless, Mobile Networks & its applications had received a significant and sustained research interest in terms of designing and deploying large scale and high performance computational applications in real life. The aim of the conference is to provide a platform to the researchers and practitioners from both academia as well as industry to meet and share cutting-edge development in the field.
PUBLISH YOUR PAPER - INTERNATIONAL JOURNAL OF PEER-TO-PEER NETWORKS (IJP2P)ijp2p
Authors are solicited to contribute to the journal by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the areas of P2P networks.
International Journal of peer-to-peer networks (IJP2P)ijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best practices in the area of P2P networking. This will include works that relate to peer-to-peer systems, peer-to-peer applications, grid systems, large-scale distributed systems, and overlay networks. The journal offers a forum in which academics, consultants, and practitioners in a variety of fields can exchange ideas to further research and improve practices in all areas of P2P.
3rd International Conference on Blockchain and Internet of Things (BIoT 2022)ijp2p
3rd International Conference on Blockchain and Internet of Things (BIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and
applications of Blockchain and Internet of Things. The Conference looks for significant contributions to all major fields of the Blockchain and Internet of Things in theoretical and
practical aspects. Authors are solicited to contribute to the conference by submitting articles that illustrate research
results, projects, surveying works and industrial experiences that describe significant advances in the areas of Blockchain and Internet of Things.
International Journal of peer-to-peer networks (IJP2P)ijp2p
The International Journal of peer-to-peer networking is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of P2P Networks. The journal provides a platform to disseminate new ideas and new research, advance theories, and propagate best practices in the area of P2P networking. This will include works that relate to peer-to-peer systems, peer-to-peer applications, grid systems, large-scale distributed systems, and overlay networks. The journal offers a forum in which academics, consultants, and practitioners in a variety of fields can exchange ideas to further research and improve practices in all areas of P2P.
CALL FOR PAPERS - 4th International Conference on Internet of Things (CIoT 2022)ijp2p
4th International Conference on Internet of Things (CIoT 2022) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of IoT.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the areas of Internet of Things.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
We have compiled the most important slides from each speaker's presentation. This year’s compilation, available for free, captures the key insights and contributions shared during the DfMAy 2024 conference.
Literature Review Basics and Understanding Reference Management.pptxDr Ramhari Poudyal
Three-day training on academic research focuses on analytical tools at United Technical College, supported by the University Grant Commission, Nepal. 24-26 May 2024
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsVictor Morales
K8sGPT is a tool that analyzes and diagnoses Kubernetes clusters. This presentation was used to share the requirements and dependencies to deploy K8sGPT in a local environment.
NUMERICAL SIMULATIONS OF HEAT AND MASS TRANSFER IN CONDENSING HEAT EXCHANGERS...ssuser7dcef0
Power plants release a large amount of water vapor into the
atmosphere through the stack. The flue gas can be a potential
source for obtaining much needed cooling water for a power
plant. If a power plant could recover and reuse a portion of this
moisture, it could reduce its total cooling water intake
requirement. One of the most practical way to recover water
from flue gas is to use a condensing heat exchanger. The power
plant could also recover latent heat due to condensation as well
as sensible heat due to lowering the flue gas exit temperature.
Additionally, harmful acids released from the stack can be
reduced in a condensing heat exchanger by acid condensation. reduced in a condensing heat exchanger by acid condensation.
Condensation of vapors in flue gas is a complicated
phenomenon since heat and mass transfer of water vapor and
various acids simultaneously occur in the presence of noncondensable
gases such as nitrogen and oxygen. Design of a
condenser depends on the knowledge and understanding of the
heat and mass transfer processes. A computer program for
numerical simulations of water (H2O) and sulfuric acid (H2SO4)
condensation in a flue gas condensing heat exchanger was
developed using MATLAB. Governing equations based on
mass and energy balances for the system were derived to
predict variables such as flue gas exit temperature, cooling
water outlet temperature, mole fraction and condensation rates
of water and sulfuric acid vapors. The equations were solved
using an iterative solution technique with calculations of heat
and mass transfer coefficients and physical properties.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's thought to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of general workflow and administration process of the shop. The main processes of the system focus on customer's request where the system is able to search the most appropriate products and deliver it to the customers. It should help the employees to quickly identify the list of cosmetic product that have reached the minimum quantity and also keep a track of expired date for each cosmetic product. It should help the employees to find the rack number in which the product is placed.It is also Faster and more efficient way.
An Approach to Detecting Writing Styles Based on Clustering Techniquesambekarshweta25
An Approach to Detecting Writing Styles Based on Clustering Techniques
Authors:
-Devkinandan Jagtap
-Shweta Ambekar
-Harshit Singh
-Nakul Sharma (Assistant Professor)
Institution:
VIIT Pune, India
Abstract:
This paper proposes a system to differentiate between human-generated and AI-generated texts using stylometric analysis. The system analyzes text files and classifies writing styles by employing various clustering algorithms, such as k-means, k-means++, hierarchical, and DBSCAN. The effectiveness of these algorithms is measured using silhouette scores. The system successfully identifies distinct writing styles within documents, demonstrating its potential for plagiarism detection.
Introduction:
Stylometry, the study of linguistic and structural features in texts, is used for tasks like plagiarism detection, genre separation, and author verification. This paper leverages stylometric analysis to identify different writing styles and improve plagiarism detection methods.
Methodology:
The system includes data collection, preprocessing, feature extraction, dimensional reduction, machine learning models for clustering, and performance comparison using silhouette scores. Feature extraction focuses on lexical features, vocabulary richness, and readability scores. The study uses a small dataset of texts from various authors and employs algorithms like k-means, k-means++, hierarchical clustering, and DBSCAN for clustering.
Results:
Experiments show that the system effectively identifies writing styles, with silhouette scores indicating reasonable to strong clustering when k=2. As the number of clusters increases, the silhouette scores decrease, indicating a drop in accuracy. K-means and k-means++ perform similarly, while hierarchical clustering is less optimized.
Conclusion and Future Work:
The system works well for distinguishing writing styles with two clusters but becomes less accurate as the number of clusters increases. Future research could focus on adding more parameters and optimizing the methodology to improve accuracy with higher cluster values. This system can enhance existing plagiarism detection tools, especially in academic settings.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
1. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
DOI : 10.5121/ijp2p.2015.6102 11
BSI: BLOOM FILTER-BASED SEMANTIC INDEXING
FOR UNSTRUCTURED P2P NETWORKS
Rasanjalee Dissanayaka1
, Sushil K Prasad2
, Shamkant B. Navathe3
and Vikram
Goyal4
1
Department of Computer Science, College of Saint Benedict and Saint John's University,
Collegeville, Minnesota USA
2
Department of Computer Science, Georgia State University, Atlanta, Georgia USA
3
College of Computing, Georgia Institute of Technology, Atlanta, Georgia USA
4
IIIT-Delhi, New Delhi, India
ABSTRACT
Resource management and search is very important yet challenging in large-scale distributed systems like
P2Pnetworks. Most existing P2P systems rely on indexing to efficiently route queries over the network.
However, searches based on such indices face two key issues. First, majority of existing search schemes
often rely on simply keyword based indices that can only support exact string based matches without taking
into account the meaning of words. Second it is difficult, if not impossible, to devise query based indexing
schemes that can represent all possible concept combinations without resulting in exponential index sizes.
To address these problems, we present BSI, a novel P2P indexing and query routing strategy to support
semantic based content searches. The BSI indexing structure captures the semantic content of documents
using a reference ontology. Our indexing scheme can efficiently handle multi-concept queries by
maintaining summary level information for each individual concept and concept combinations using a
novel space-efficient Two-level Semantic Bloom Filter(TSBF) data structure. By using TSBFs to represent
a large document and query base, BSI significantly reduces the communication cost and storage cost of
indices. Furthermore, We devise a low-overhead mechanism to allow peers to dynamically estimate the
relevance strength of a peer for multi-concept queries with high accuracy solely based on TSBFs. We also
propose a routing index compression mechanism to observe peers’ dynamic storage limitations with
minimal loss of information by exploiting a reference ontology structure. Based on the proposed index
structure, we design a novel query routing algorithm that exploits semantic based information to route
queries to semantically relevant peers. Performance evaluation demonstrates that our proposed approach
can improve the search recall of unstructured P2P systems up to 383.71% while keeping the
communication cost at a low level compared to state-of-art search mechanism OSQR [7].
KEYWORDS
Unstructured P2P Networks, P2P Search, Query Routing, Bloom Filters, Semantic Search
1. INTRODUCTION
P2P systems have become a popular means of sharing large amounts of data among users over
recent years. They are considered an attractive solution for applications requiring high scalability,
robustness and autonomy. Efficient resource discovery in basic unstructured P2P systems,
however, suffers from high costs mainly due to lack of global knowledge. To make matters
worse, with the advent of semantic web, more and more users associate their shared resources
2. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
12
with semantic meta-information. This mandates that the P2P networks provide methods that allow
a user not only to describe resources from a semantic viewpoint but also allow users to run more
complex queries addressing several semantic properties or relationships among semantic entities
and resources. Majority of search algorithms proposed for P2P networks to date, however, are
simple keyword searches or title based searches. These are inadequate for formulating semantic
based queries and consequently, do not provide semantically relevant results.
Index engineering is at the heart of P2P search methods. P2P indices can be broadly categorized
to local, central and distributed indices. In a local index, a peer only keeps references to its own
data. In a central index scheme such as one employed by Napster [1], a single server peer
maintains data about all the other peers in the network in its index. Most widely used scheme is
the distributed index scheme where peers maintain pointers towards targets in the network.
Current P2P indexing schemes face three main problems. First, overwhelming majority of
indexing schemes proposed to date are simple keyword based indices. They do not associate
semantics(i.e. meaning) to keywords in index resulting in poor search accuracy. Second, the
handful of semantic based indexing schemes [11], [13] are either computationally intensive [13]
or index size varies largely with the dimensionality of the content [13]. Third, maintaining query
based indices (as opposed to simple keyword based indices)in a space efficient manner is
challenging in P2P networks. Maintaining minimum routing state per peer is a crucial component
in P2P searching. However, maintaining small size query based routing indices while not
compromising the quality of information they hold is a challenging problem and has not been
addressed sufficiently by the current body of work.
Maintaining query based indices in P2P networks are attractive for two reasons: First, such an
index allows a peer to maintain more information about its neighborhood by allowing it to assign
a relevance strength to a neighbor per query. Second, as indicate by many web search logs [19]
majority of searches are multi-term(keyword or concept) queries that could benefit from query
based indices to achieve better performance. Many of the proposed work, however, are optimized
for single keyword queries. Handling multi-keyword queries is rather inefficient using single
keyword indices. On the other hand, maintaining multi-term query based indices for better
performance is simply impractical due to the resulting exponential size of indices. Little to no
explicit measures have been taken to ensure high quality of indices regardless of the query length
while maintaining small index sizes.
To address the aforementioned issues, in this paper we propose BSI, a semantics based indexing
framework, which aims to improve the quality and efficiency of search in P2P networks by using
a shared ontology as a reference for building index. Our indexing framework is designed for
unstructured P2P systems where network imposes no structure on the overlay network or data
placement. Our work addresses several important issues raised by current indexing schemes:
First, our semantic based index framework takes into account the meaning of words thereby
allowing peers to evaluate multi-concept queries. A reference ontology concepts serve as the
index terms and relevance strengths for different queries (concept combinations)are maintained in
the index. Each relevance strength is a combined measure of information content and the number
of relevant documents reachable through a given peer and its neighborhood. Second, we maintain
attractive small size index while maintaining the quality of index using Bloom filters. The size of
the routing index is limited by the number of concepts in the ontology and can be easily
compressed to accommodate space restrictions by utilizing hierarchical ancestor-descendant
relations in semantic ontology. Finally, we explicitly capture the notion of multi-concept query
based indexing in our index structure by allowing peers to maintain relevance strength for
different queries for each neighbor. To maintain large volume of multi-concept queries and their
3. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
13
corresponding relevance strength at a peer in a space efficient manner we introduce a novel
Bloom filter based data structure called Two-level Semantic Bloom Filters(TSBF). TSBF is an
extension to the traditional Bloom filter which incorporates the notions of ontology based meta
data and relevance strengths of queries. TSBF allows us to limit the size of a routing index while
maintaining large amount of summarized information regarding peers’ strengths for possible
queries to make a informed decision in routing multi-concept queries. Furthermore, We devise a
low-overhead mechanism to allow peers to dynamically estimate the relevance strength of a
neighbor for multi-term queries with high accuracy using TSBFs. We also propose a index
compression mechanism to observe dynamic storage limitations of peers with minimal loss of
information by exploiting the hierarchical relationships in the reference ontology structure.
Finally, based on the proposed indexing scheme, we design a novel query routing algorithm that
exploits semantic based information with different granularity to route queries to semantically
relevant peers.
In BSI, we do not employ computationally expensive methods like LSI. Rather, we use concepts
in a reference ontology to build peer local and routing indices to aid in the query routing process.
The process of constructing peer indices is much simpler and consumes less memory and time in
constructing them. Among many popular semantic indexing mechanisms OSQR [7], OLI [18],
Ontsum [11] The related work most comparable to ours is OSQR [7], a semantic search protocol
that maintains total relevant documents reachable through a neighbor for each concept from a
reference ontology in its routing index as routing state. The authors propose a mechanism to
calculate a conservative estimate of strength of a peer for a multi-concept query based on the
number of documents reachable from that peer for each individual query concepts. The
disadvantage of OSQR however is that its search cost is considerably high due its knowledge
dissemination mechanism which utilize search messages for piggybacking data.
Overall, many existing work for semantic P2P search lacks the ability to efficiently answer multi-
term queries with greater accuracy mainly due to lack of quality and quantity of routing state. BSI
gracefully handles multi-term queries by exploiting both the concepts in query as well as
structural relations of the query concepts in ontology to intelligently route a query to the most
promising peer. Our solution is scalable as the index structures can easily be compressed to
accommodate dynamic storage restrictions. Performance evaluation demonstrates that our
proposed approach can improve the search efficiency of unstructured P2P systems while keeping
the communication cost at a significantly lower level compared with state-of-art unstructured P2P
systems. The rest of the paper is organized as follows: We present the related work in Section 2 .
The preliminary concepts used in the paper are presented in Section 3. The P2P system
architecture including routing index construction and maintenance and query routing are given in
Section 4. In Section 5, we evaluate the proposed algorithms via simulation. Finally we conclude
and summarize our work in Section 6.
2. RELATED WORK
Many search mechanisms for P2P networks have been proposed in the literature over the past few
years. While some notable blind search mechanisms proposed for unstructured networks include
flooding [2] and random walk [12], popular informed search methods include Routing Indices
[21] and intelligent search [10]. However, these methods are intended for simple keyword
searches only.
Semantic Overlay Networks (SON) [6] is one of the earliest works in incorporating semantic
aspects into P2P file sharing applications. Here, authors propose a peer clustering approach based
on semantic content a peer shares. This clustering methodology is completely distributed and also
4. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
14
supports self-organization. However, in SON, peer joining process requires flooding the network
to request classification hierarchy and also for finding semantically related peers. P2P systems
such as Edutella [15] and InfoQuilt [3] use flooding as a basis for semantic search. Edutella
proposes an RDF metadata model and is a super-peer based mechanism. However, the notion of
super-peers makes the solution less attractive as super-peers need more resources than their sub-
peers to maintain all routing indices as well as RDF triples representing sub-peers. This also leads
to a substantial overhead in index updating at super-peers when a node joins, leaves, or changes
its content. Compared to this, our approach is much simpler and involves minimal overhead in
terms of local knowledge maintenance within peers. There is no notion of super peers and each
peer only keeps a simple semantic based Bloom filter (TSBF) per its neighbor. pSearch [20] uses
a semantic vector to describe and find the resources in the P2P network. SemreX [9] uses a
concept tree to construct a semantic overlay network on top of P2P overlay. Both pSearch and
SemreX use Latent Semantic Indexing (LSI) [5] to map documents to semantic vectors. LSI
however is computationally expensive, and when the network content changes frequently, this
method needs to be applied often. In BSI, we do not employ computationally expensive methods
like LSI. Rather, we use concepts in a reference ontology to build peer local and routing indices
to aid in the query routing process. The process of constructing peer indices is much simpler and
consumes less memory and time in constructing them. OntIndex [13] is a ontology based
indexing mechanism where authors propose a semantic based search algorithm for unstructured
P2P systems. However, the notion of semantic information content is not taken into account in
[13] resulting in low search performance. Also, index generation and updating in OntIndex is
resource consuming and generates considerable traffic. OSQR [7] is a ontology based semantic
search protocol which provides a rich semantic knowledge representation of peers by combining
the both total documents reachable from a peer and the semantic richness of the peer based on the
information content of the reachable documents. OSQR, however, heavily relies on knowledge
dissemination mechanism which utilize search messages for piggybacking data contributing to
increased search cost. OLI [18] is yet another ontology based indexing method for structured P2P
systems. Semantic search methods applied in structured networks such as in OLI however,
generally suffer from the shortcoming of structured overlays such as strict enforcement of data
placement and high maintenance cost of peers joining, and leaving as well as the cost of content
updating. To avoid these problems, we consider unstructured P2P overlay for our work. Ontsum
[11] is another ontology based indexing scheme where heterogeneous ontologies between peers
are assumed. This approach however requires computationally expensive ontology generation and
ontology mapping techniques to be employed in a distributed setting and allow only RDQL based
queries which requires some level of end user expertise to be used.
3. PRELIMINARIES
3.1. Network Topology
We consider an unstructured P2P network with a large set of peers {P1,P2,..., PP}. Each peer in the
network can only communicate with its direct neighbors in one hop distance. A peer is assumed
to have on the average γ number of neighboring peers. Peers may leave and join the network at
any time. Each peer has a local text document database that can be accessed through a local
index. The peer uses its local index to evaluate all the content queries and returns pointers to the
documents having the queried content.
5. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
15
3.2. Semantic Ontology
We assume the presence of a global reference semantic ontology which is known and agreed
upon by every peer in the network. The ontology is a set of m semantic concepts C={c1,c2,..., cm}
which are related to each other via IS-A relationship (i.e., hypernym/ hyponym). We denote all
the leaf concepts in the ontology by Cl .The root concept of the ontology is denoted by Cr.
3.3. Data Distribution
As stated earlier, each peer has a set of text documents. Each document is represented as a vector
of concepts from Cl using a Vector Space Model, a standard technique for representing document
contents in the area of information retrieval. Each document vector consists of a vector of concept
relevancy scores. To construct a document vector, the document is subjected to a process of text
mining followed by word sense disambiguation to identify those concepts that exist in the
document along with their concept frequencies(i.e. number of occurrences of a concept in the
document).Then the concept frequencies of each concept are normalized to a [0-1] range by
applying maximum frequency normalization by dividing them by the maximum concept
frequency for that concept encountered by the peer in its local document collection.
3.4. Semantic Query
A query consists of the conjunction or disjunction of one or more concepts from the leaf concepts
(Cl) of the ontology. Existing techniques [4], [5] proposed by the research community can be
employed to convert keyword queries to concept queries. A query returns a set of k documents
having its relevancy above some user defined threshold value. The relevancy of a document for a
given query is computed using concept vector similarity with respect to the query concepts. We
use the well known cosine similarity measure to calculate the similarity between a query Q and
document vector D:
݈ܵ݅݉݅ܽݕݐ݅ݎሺ,ܦ ܳሻ =
.ܦ ܳ
‖.‖ܦ ‖ܳ‖ (1)
A document is said to qualify as an answer for a given query if the cosine similarity between its
document vector and query exceeds a predefined relevance threshold.
3.5. Bloom Filters
A Bloom Filter (BF) is a data structure suitable for performing set membership queries very
efficiently. A Standard Bloom Filter representing a set S={s1,s2,..., sn} of n elements is generated
by an array of m bits and uses k independent hash functions h1,h2,...,hk. They are space efficient
data structures which provide constant time lookups and no false negatives. The downsides of
BFs are that they can result in false positives and do not allow item deletion. However, based on
the application requirement the false positive rate can be significantly lowered. Therefore care
must be taken in choosing k and m so that the false positive rate is acceptable. It has been shown
that the probability of a false positive Pfalse can be given by the following equation
݂݈ܲܽ݁ݏ = ሺ1 − ݁−݇݊/݉
ሻ݇
(2)
6. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
16
There are many variants of standard BF such as Spectral BF, counting BF, compressed BF, etc. In
our work we utilize both standard Bloom filters and Spectral Bloom Filters (SBF). SBFs is an
extension of the traditional BF for multi-sets allowing filtering of elements whose multiplicities
are below a threshold. SBFs allow querying on item multiplicities as well as deletion. SBFs
maintain a counter per bit in the bit array to keep track of the number of times the bit is set.
Simply taking the minimum count over all bits set for a given element will produce the
multiplicity of that element in the represented multi-set.
Figure 1. Ontology with seven concepts and IS-A relations between them.
(a) (b)
Figure 2. (a) A traditional Bloom filter with three hash functions. (b) A Spectral Bloom filter with three
hash functions.
4. SYSTEM DESIGN
Three key issues in current indexing strategies are (i) their inability to associate semantic of
represented content , (ii)low efficiency in routing multi-term queries and (ii)large index sizes. In
order to overcome the shortcomings of existing indexing strategies and semantic query evaluation
techniques, we put forward BSI(Bloom filter-based Semantic Indexing). We begin with the
design of the Two-level semantic Bloom Filter (TSBF), a space efficient data structure that
represents probabilistic information regarding contents a peer share with the network. We then
describe our routing index structure used for efficient query routing.
4.1. Two-level Semantic Bloom Filter (TSBF)
Bloom filters serve as appropriate index structures for resource discovery due to their scalability,
and low cost storage and distribution. However, they do not support semantic based multi-term
queries as they have no means of representing probabilistic information regarding ontological
data. To this end, we introduce a novel Bloom filter based data structure (TSBF) that aim at
supporting efficient conjunctive (and disjunctive) semantic query routing in unstructured P2P
networks.
7. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
17
TSBF is an extension of the traditional Bloom Filter to encode probabilistic information regarding
richness of a peer for different queries based on its local documents with the same false positive
probability as the traditional Bloom filters. In particular, it encodes goodness of a peer for
different multi-concept queries in terms of its distance to documents, the size of the document
collection and information content of these documents. Posed with the query "Is Query Q a
member of TSBFP ?" the TSBFP of peer P returns a relevance strength of P for Q based on
information encoded in it. In our work, we define this relevance strength of a peer to be the total
number of local documents in P for Q.
To implement this functionality TSBF is designed as a two-level Bloom filter where level 1
(TSBF1,P ) represents the set of qualified documents in P and level 2 represents the set of queries
answerable by P. In particular, for level 1 there exist multiple bit arrays each representing the
local documents of P rich in C (TSBF1,P (C)). These bit arrays are similar to those employed in
traditional Bloom filters and is supported by a sufficiently large body of research work [14], [16],
[17] that allows us to estimate number of documents reachable for a multi-concept query solely
based on these bit arrays.
Similar to level 1, level 2(TSBF2,P) also contains multiple bit arrays each representing different
multi-concept queries that whose concepts have C as the least common ancestor in the ontology
hierarchy for which P has at least one qualified document in its local document collection
(TSBF2,P (C)). To enable associating the number of relevant documents reachable through P for
each query Q we implement each bit array in level 2 as Spectral Bloom Filters(SBFs). SBFs
maintains a counter per each bit in a bit array thereby allowing deriving the number of times an
element is added to bit array. In a level 2 bit array elements are queries and a given query Q is
added as many times as the number of qualified documents exist in P for Q.
Intuitively, while level 1 contain a bit array per ontology concept, level 2 only needs to maintain a
bit array per non-leaf ontology concept. Posed with a query Q, TSBF2,P returns the actual number
of documents in P qualifies as answers for Q whereas the TSBF1,P provides an estimate of the
same. While both provide two alternative ways for estimating the number of documents in P for
Q, TSBF2,P is proffered over the other as it provides exact information regarding P’s strength for a
given query. When Q does not exist as a member in TSBF2,P then TSBF1,P is used to derive an
estimate.
In our implementation, we size the set of bit arrays in both levels in a TSBF the same , but their
sizes can be adjusted independently. Similar to traditional Bloom filters, TSBF uses a fixed
number of hash functions h1,h2,...,hk. Figure 3 shows a TSBF of a peer P for the ontology given in
Figure 1. In the figure, the TSBF is a set of 10 bit arrays, 3 of which is for level 2, each
representing possible queries per nn-leaf concept in the ontology while the other 7 is for level 1,
where each bit array represents documents reachable for the given concept.
8. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
18
Figure 3. The TSBF for a peer P with documents d1, d2, d3, d4, d5, d6. for ontology in Figure 1.
Such a TSBF of a peer serves two purposes: it allows fast query processing within the peer and
also serve as the foundation for constructing peer’s routing index. Use of Spectral Bloom filters in
place of traditional Bloom filters allows associating a frequency of occurrence to each individual
document/concept combination which adds to the information quality of Routing indices.
1) TSBF Construction: The TSBF of a peer P, TSBFP is constructed at start-up when peer joins
the network for the first time. Initially all the bits of level 1 bit arrays are initialized to false while
the all the bits in level 2 are initialized with a counter value zero. To populate level 1 bit arrays
peer can analyze its document vectors to identify those qualifying documents per ontology
concept and insert into the bit array responsible for the concept to populate them. In other words,
P insert all its local documents that can answer concept C in its TSBF1,P (C) bit array.
Following the same process for populating level 2 bit arrays, however, leads to practical issues.
We could ideally generate all possible concept combinations (queries)that exceeds a predefined
cosine similarity threshold per local document of P and insert them in the appropriate bit arrays in
level 2. Note that, same concept combination can be answered successfully by multiple
documents thus allowing the combination to be inserted into the relevant bit array that many
times. For instance, a concept combination answerable by a document d in P is inserted into
TSBF2,P (C) where C is the least common ancestor of concepts in combination based on the
reference ontology. This method of level 2 TSBF population is certainly a viable option for those
peers with sufficiently large amount of storage at hand as the number of concept combination that
can be answerable by a peers document collection could be quite large. Such a technique is often
unfavorable, however, as this might lead to high false positive rates due to insertion of large
number of elements to bit arrays. Moreover, there is no need to generate and insert queries of all
possible lengths to a TSBF. As a query length analysis [19] on various search engines conducted
in 2011 indicate, an average length of a query contains 3.08 terms and more than 92% queries
contain only 5 terms or less. Moreover, there may be concept combinations that end user is
simply not interested in as queries and the number of such concept combinations could be
unnecessarily large. Therefore we propose that a peers dynamically build TSBF level 2 based on
the queries it receives by adding only those queries that it can answer using its documents or
through neighborhood. This will significantly lower the useless concept combinations in a TSBF
level 2 thus controlling the false positive rate from going up. A peer also has the capability of
aggressively building its TSBF level 2 by periodically gossiping with neighbors to exchange their
query histories to discover new concept combinations that should be in its TSBF.
9. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
19
4.2. Routing Index
The routing index of a peer summarizes information of its neighborhood useful for intelligent
query routing. One of the key requirements of the routing index of a peer is to represent all
possible queries answerable by a neighbor along with the corresponding relevance strengths of
the neighbor per each individual query in a compact manner. The routing index structure achieves
this by summarizing TSBF’s of neighbors at the peer. The routing index of a peer P consist of a
set of <key, value> pairs where key is a neighbor N and value is a TSBFP→N , a summarized
TSBF that represents the set of documents reachable through N and the set of queries answerable
through the N based on the set of peers in the network N can reach. TSBFP→N generally represents
an aggregation of a set of TSBFP′ structures each belonging to a peer P′ reachable through P.
4.2.1 Routing Index Updation
The drastic reduction in the overheads of routing table construction and maintenance in BSI is
achieved through use of TSBFs. At startup, peers exchange their TSBFs with each other.
Therefore the entry for neighbor N, TSBFP→N, at P’s routing index is initially a replica of TSBFN
at N representing the set of documents and queries in N’s local storage. This is later updated by
information carried by N to P from different query routes to represent documents and queries
reachable through N. Particularly, presented with piggybacked information carried in a query
path, an update at P aggregates TSBFPi of each peer Pi visited by the query received through N, by
taking the union equivalent of them subjecting bit counts to exponential decay depending on the
hop distance of Pi to P. The exponential decay of counts in bits during forwarding of search
messages, exponentially reduces the impact of a peer’s TSBF on another peer’s routing index
with the distance between the two.
Equation 3 gives the exponential decay based union for combining TSBFs of set of peers Nj for a
concept c. x here represents the level of the TSBF(i.e.x ∈{1,2}) and α is the decay factor. We
simply used the distance between peer Nj to P as α. Here the corresponding bits of respective bit
arrays are aggregated using a function F. F is simply bitwise OR between bit i of corresponding
bit arrays in all TSBFs in the query path for level 1 TSBFs and is sum for the same for level 2
TSBFs bit arrays. To avoid information becoming stale, periodic updates are triggered by peers
updating Routing index for each neighbor based on the messages it received. Even though the
updates are triggered periodically, the updates occur only if a peer sees a substantial difference in
the values in the bits in bit arrays it is interested in.
ܶܵܨܤݔ݁,ݔܷ݊݅݊ ሺܿሻ. ܾ݅ݐ݅ = ܨ݆ =1
݇
ቆ
ܶܵܨܤܰ.ݔ݆
ሺܿሻ. ܾ݅ݐ݅
ߙ
ቇ (3)
The above design achieves our primary objective of efficiently maintaining probabilistic
information about the content stored in the neighborhood. Algorithm 1 summarizes the routing
index creation and updating procedure.
4.2.2 Routing Index Compression
To observe dynamically changing space restrictions, a peer can simply compress routing index at
will by exploiting ontology IS-A structure. The main intuition here is that, due to IS-A relations,
the ancestor concepts in an ontology hierarchy is fully qualified to represent a descendant
concept. Therefore the TSBFP→N of a neighbor N in P’s routing index can be compressed even
further by combining the bit arrays for two or more concepts sharing a IS-A relation by applying
union equivalent of represented sets to corresponding bit arrays. for example, corresponding level
10. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
20
bit arrays of B and D in ontology in Figure 1 can be combined to generate one bit array
represented at D this way.
This union equivalent operation for a set of such TSBFs for a neighbor N at P representing a
combining set of concepts sharing IS-A relations is given in Equation 4. The level x is 1 or 2 and
Ca is the ancestor concept of all descendant concepts cj being aggregated. Similar to Equation 3, F
refers to bitwise or and summation of counts for level 1 and level 2 respectively.
ܶܵܨܤݔ,݊݅݊ݑ ሺܿܽ ሻ. ܾ݅ݐ݅ = ܨ݆ =1
݇
൫ܶܵܨܤݔ,ܲ→ܰሺ݆ܿ ൯. ܾ݅ݐ݅ (4)
When the need arises to compress a routing index, the priority first goes to combining bit arrays
for different concepts at level 2 (i.e. TSBF2,P) as compressing this does not result in minimal loss
of information. The reason for this is, TSBF2,P represent exact information regarding frequency of
concept combinations. The second priority is given to combining level 1 TSBF1,P bit arrays. Since
this is used for estimation purposes, combining multiple bit arrays for different concept into one
will degrade the estimation accuracy. Similarly, when choosing which bit arrays to combine in a
given level, the higher probability is given for those bit arrays representing higher level concepts
closer to the root as the information represented by the ontology generalizes towards ontology
root.
Combining multiple bit arrays into one, however, results in slightly increased false positive rate
and can be controlled by either not having excessively large number of items in bit arrays or by
allowing bit arrays to grow as needed to maintain the desired false positive rate.
Algorithm 1 Routing Index Construction and Maintenance
Input:
Msg = Message
N = Neighbor sending Msg
TSBFList = The list of TSBFs in Msg, list index+1 denote distance between P and peer to
which TSBF at index belong
P = Message receiver peer executing algorithm
C = set of concepts in reference ontology
Output: ‘
selected = the top k
1:
∇ Create index entry for N for alive-ping message
2: if Msg.TYPE = ALIVE then
3: TSBFN ← TSBFList.get(0)
4: for each Concept c ∈ C do
5: for i = 0 to TSBFN(c).length - 1 do
6: index:TSBF1,P→N(c).biti ← TSBF1,N (c).biti
7: index:TSBF2,P→N(c).biti ← TSBF2,N (c).biti
8: end for
9: end for
10: end if
11: ∇ Remove index entry for N for leave-ping message
12: if Msg.TYPE = LEAVE then
13: Index.remove(N)
14: end if
15: ∇ Update index entry of N for information received through a search related message
11. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
21
16: if Msg.Type = QUERY or HIT or MISS then
17: ∇ Determine concepts for which Routing index should be updated
18: Query q ← Msg.query
19: lca(q) ← least common ancestor concept of q
20: Cupdate ← empty set representing concepts for which index.TSBFP (N) should be updated
21: Cupdate.add(lca(q))
22: Cupdate.addAll(q)
23: ∇ Update Routing index entry for N
24: for each Concept c ∈ Cupdate do
25: ∇ Calculate a new TSBF entry for N based on TSBFList
26: TSBF1,N′ (c) ← calculate using Equation 3
27: TSBF2,N′ (c) ← calculate using Equation 3
28: if TSBF1,N′ (c).biti > index.TSBF1,P→N(c).biti then
29: Index.TSBF1,P→N′(c).biti ← TSBF1,N′ (C).biti
30: end if
31: if TSBF2,N′ (c).biti > index.TSBF2,P→N(c).biti then
32: Index.TSBF2,P→N(c).biti ←TSBF2,path(C).biti
33: end if
34: end for
35: end if
4.3. Query Routing
The query routing algorithm is summarized in Algorithm 2. The objective of search in BSI is to
retrieve many relevant documents as possible for a given query. The search messages in BSI are
TTL bounded and therefore a query can travel maximum TTL hops before it is discarded. Every
query receiver peer performs a local search and append results of retrieved documents to the
query and then forwards the query to the most promising neighbor if the TTL has not expired.
Upon TTL expiration, the peer returns a response message to the query originator along with the
retrieved results. The neighbor selection process at a peer P for a query q is performed as follows:
Upon receipt of q, P checks TSBFP→N associated with each of its neighbor N. The lookup returns
the frequency of documents (exceeding a relevance threshold) in each neighbor N containing
query as a concept combination. This denotes the relative probability of success of N for the
query compared to its neighbors. To calculate this TSBF, P first lookup in level 2, TSBFP→N (c),
where c is the least common ancestor of concepts in query. If query exist as an element, TSBFP→N
returns the associated document frequency. The fact that TSBF2,P→N (c) does not contain q as a
concept combination does not necessarily mean that combination is not reachable through N. It
may be a situation arising from P not exploiting all reachable peers in its TTL hop radius or P has
not encountered q in its query history. In such a case P calculates an estimate of the reachable
documents for a multi-concept queries based on TSBF2,P→N from its routing index. This estimation
procedure is described below.
1) Estimating set intersection based cardinality from Bloom filters: We need a reliable
mechanism to find the number of documents reachable through a neighbor for a given query
solely based on the set of Bloom filters each representing documents reachable through the
neighbor for each query concept. Research community has proposed many works to estimate the
cardinality(i.e. number of elements) of an original set solely based on its Bloom filter bit array
[14], [16], [17]. For our work we used the work presented by authors of [16].
12. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
22
Given a set S and its Bloom Filter BFS with m bits out of which t are true bits, and k hash
functions [16] defines the cardinality of a set is as follows:
|ܵ| =
݈݊ ቀ1 −
ݐ
݉ቁ
݇ × ݈݊ ቀ1 −
1
݉ቁ
(5)
Here, the m denotes the length of the Bloom filter and t the number of true bits in the Bloom
filter. |S| denotes the cardinality of the set represented by the Bloom filter bit array.
For cardinality estimation of union of multiple sets S∪ = S1∪ S2∪...∪Sn (needed for OR query
routing) authors simply propose to take the bitwise OR of the representative Bloom filter bit
arrays and apply equation 5 to the resulting Bloom filter. Estimating cardinality of intersection of
sets based on bloom filters is not so straightforward. We used the set inclusion-exclusion
principle in combinatorics to derive the cardinality of intersection of sets based on the
cardinalities of the unions of sets. We employed the equation 5 to calculate the intersection of sets
S∩ = S1∩ S2∩...∩Sn:
อሩ ܵ݅
݊
݅=1
อ = ሺ−1ሻ݇+1
൭ |ܵ݅ ∪ … ܵ݇|
ݒ
1≤݅<݇≤݊
൱
݊
݇=1
(6)
Each set Si in our work represent the set of documents reachable for each concept ci for a given
query through a given peer P. The equation 5 only requires cardinalities of individual sets and
unions of sets to compute the cardinality of intersection. Therefore upon a receipt of a query a
peer can simply use above mentioned Bloom filter based cardinality estimation techniques to
calculate each term in right hand side of Equation 6 thus computing the cardinality of intersection
for a neighbor for a given query.
Algorithm 2 Query Routing
Input:
Q = Query Message
P = Message receiver peer
index = The routing index of P
Candidates(Q) = Neighborhood(P) - Q.VisitedPeers
k = the walker count
Output: ‘
R = Retrieved documents
Selected(Q) = the selected peers for query forwarding
1: Selected(Q) ← φ
2: Q.TTL ← Q.TTL - 1
3: R ← LocalSearch(o)
4: ∇ Found target Object
5: if Q.TTL ≤ 0 then
6: if R ≠ φ then
7: Return a HIT with R and terminate search
8: Else
9: Return a MISS with R and terminate search
10: end if
13. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
23
11: Else
12: Clca ← least common ancestor concept of Q
13: for each Neighbor N ∈ Candidates(Q) do
14: ∇ Look up score in index.TSBF2,P→N
15: if index.TSBF2,P→N(Clca) contains Q then
16: ScoreN ← minimum count of all bits hashed by Q in index.TSBF2,P→N(Clca)
17: else ∇ Estimate score from index.TSBF1,P→N
18: QueryBFListN ← φ
19: for each Concept c ∈ Q do
20: QueryBFListN.add(index. TSBF1,P→N(c))
21: end for
22: ScoreN ← calculate using Equation 6 with QueryBFListN as input
23: end if
24: Sort Candidates(Q) based on scores
25: Selected(Q) ← Neighbor in Candidates(Q) with highest score
26: end for
27: Return Selected(Q) and R
28: end if
5. EXPERIMENTS
In this section we present a simulation-based evaluation of BSI and compare its performance with
other mechanisms for query routing in unstructured peer to peer systems.
5.1. Simulation Methodology
The parameters for the simulation TSBF used in BSI are listed in Table 1. The simulations use a
default 1024 peer unstructured network, unless noted otherwise. To simulate the dynamic
behavior of the network under peer churn, we inserted online nodes to the network while
removing active nodes at varying frequencies. During a single simulation run, the network size
was maintained at roughly the original starting size by ensuring the number of nodes that join the
network dynamically was the same as that leaving the network. On an average, 80 nodes each are
added and removed from the network during each simulation run.
We mainly compare our work against OSQR [7] and Random Walk [8]. OSQR is a concept based
indexing mechanism which tries to improve the performance for multi-concept queries with a
concept based routing index. The OSQR routing index of a peer maintains the total reachable
documents per neighbor per concept in a reference ontology as the relevance strengths of the
neighbor for each ontology concept. The authors propose taking the minimum of the number of
documents reachable for each query concepts for a given neighbor as its relevance strength to the
query. For the simulation, the maximum number of entries in OSQR index was set to 128 and size
of maxVector (a data structure used by peers for global knowledge acquisition) was set to 10.
Random walk on the other hand is a popular blind search mechanism where a querying peer
deploys a search walkers by sending query to a randomly selected neighbor. The query format as
well as local search remains the same as our proposed BSI algorithm. We experimented with two
versions of BSI based on the design of the TSBF data structure:
1) BSI(TSBF-level1+level2): This is the original BSI algorithm where the TSBF structure
contains two levels: level1 contains a list of bit arrays each representing the set of documents
14. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
24
accessible per ontology concept and level2 contains a list of bit arrays each representing a set of
queries and the associated document frequencies per ontology concept of a peer.
2) BSI(TSBF-level1): This represents a reduced level BSI algorithm where TSBFs solely consists
of level1 only. Thus the peers rely on the reachable documents estimation process in query
routing.
Table 1. Simulation Parameters.
Parameter Range and default value
Network size 28
-211
Default:210
Peer degree distribution power law
Total distributed documents 5000
Average number of concepts per document 20
Document distribution among peers zipf (α=1.0)
Query distribution among peers zipf (α=1.2)
Mean peer documents 100
Reference Ontology Concepts 128
TTL 1-11 Default:7
Relevance threshold 0.7
TSBF filter length 250bits
No. of hash functions 7
5.2. Results
The primary performance results can be summarized as follows. For a low routing overhead, the
query recall improves by up to five orders of magnitude over OSQR proving the efficiency of the
algorithm in locating as many relevant documents as possible in a low search cost (Section 5.2.1).
Simulations over different query lengths and different filter sizes demonstrate the scalability of
BSI for multi-term query routing in sections 5.2.2 and 5.2.3 respectively. On average we observed
up to four fold performance improvement in terms of recall over OSQR for various query lengths
and different filter sizes. Finally, a study of the impact of search cost on overall search efficiency
in section 5.2.4 demonstrates that the performance improvement in recall is achieved at an
attractive two-fold improvement in search traffic over OSQR.
5.2.1 Recall
Recall is a standard performance measurement in information retrieval that denotes the fraction of
relevant documents found by an average query compared to the entire set of relevant documents
distributed in the network. While many query routing mechanisms can achieve high recalls at
sufficiently high hop limits only superior query routing algorithms can achieve high recalls with
low hop limits. Varying the hop limit allows us to demonstrate the effectiveness of our algorithm
in finding larger number of documents at low TTL values thus with lower search cost. Figure 4(a)
plots the average recall per query for BSI, OSQR and Random Walk. We observed that original
BSI (i.e. (TSBF-level1+level2)) algorithm achieved an average of 383.71% increase in recall over
OSQR while BSI(TSBF-level1) gained a 322.60% increase in recall over OSQR. The
performance improvement of BSI(TSBF-level1+level2) over Random walk was observed to be an
outstanding 1238.81%. The increase in recall is more evident in higher TTL values. From the
figure we can also infer that the original BSI with both levels present in the TSBF structure
performs better than BSI with the TSBF with only level1. An average increase of 45.16%
improvement was observed in BSI (TSBF level1+level2) over BSI (TSBF-level1). This is
15. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
25
expected, as having both levels in a TSBF obviously provide more information regarding query
relevance strength of a peer to make a intelligent query routing decision compared to a TSBF
level 1 which only provide an estimate of the same. As Figure 4(b) suggests, both versions of BSI
clearly outperforms OSQR and Random Walk in terms of recall regardless of the network size
thus proving the scalability of our algorithm.
(a) Recall vs. TTL (b) Recall vs. Network Size
Figure 4. Recall for Single Concept Query (Filter Size = 250bits)
5.2.2 Effect of Query Length
Effect of Query Length: One of the key design objectives of BSI is to achieve high recalls for
queries with more than single concept. Figure 5 plots the effect of query length on recall. We
observed a significant improvement in recall in BSI over OSQR for both BSI versions:
BSI(TSBF-level1+level2), and BSI(TSBF-level1). On average for all query sizes we observed
that recall improves four orders of magnitude (315.79%) and by three orders of magnitude
(223.88%) compared to OSQR in BSI(TSBF-level1+level2) and BSI(TSBF-level1) respectively.
Recall improves ten orders of magnitude (903.62%) and by three orders of magnitude (682.01%)
compared to Random Walk in BSI(TSBF-level1+level2) and BSI(TSBF-level1) respectively. The
superiority of BSI was more evident for queries with higher number of concepts. The reason for
this behavior can be explained using the peer knowledge summarization and the neighbor
selection mechanisms of all three algorithms. Random walk does not use any intelligence in
neighbor selection mechanism. Peers randomly select one if their neighbors to forward queries.
BSI and OSQR estimate the number of documents reachable thorough a peer for a query as its
relevance score in neighbor selection process. An OSQR peer only maintains the number of
reachable documents per ontology concept for each neighbor in its routing index and a peer
estimates the number of documents reachable from a given neighbor by taking the minimum of
the number of documents reachable for each queried concept for that neighbor. Taking the
minimum as the relevance score, however, is an optimistic estimate. This actually represents an
upper bound of the number of actual relevant documents reachable through a neighbor.
16. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
26
Figure 5. Recall vs. Query Length (TTL=7, Filter Size=250bits).
To take a more accurate decision, the decision taking peer must ideally have the set of documents
reachable through the neighbor for each queried concept and should perform a intersection of all
sets of documents relevant to the query to obtain the set of documents actually reachable through
that neighbor for the query. The size of this resulting set will accurately represent the number of
documents reachable through the neighbor for that query. However, maintaining such detailed
information is too costly in P2P networks. BSI compensates this information loss by maintaining
bloom filters representing the queries (along with number of documents reachable as associated
bit counts) and documents reachable through a neighbor, thereby providing a more compact and
high quality information summarization leading to more accurate estimations of number of
documents reachable for queries.
5.2.3 Effect of Filter Size
Having established the scalability of our general approach, we now turn our attention to the effect
of Bloom filter size in recall and storage cost.
Using TSBFs for local knowledge maintenance results in substantial increase in recall at low
storage cost. As shown in Figure 6 recall substantially increases with filter size in all BSI
versions. This is the expected behavior as larger filter sizes can keep large volume of information
at a tolerable false positive rate resulting in better query routing decision leading to high recall
rates. On average we observed a 300.47% and 216.37% average improvement in recall over
OSQR and a 1038.60% and 799.50% average improvement in recall over Random Walk for BSI
(TSBF level1+level2) and BSI (TSBF-level1) respectively.
Table. 2 shows the associated storage costs for various filter sizes. This storage cost accounts for
a peer’s TSBF and its routing index. Storage costs are shown for original BSI(TSBF with level 1
and level 2 present) and for two modified BSI versions where a TSBF contain only level1 or level
2 respectively. While maintaining a large filter size allows us to decrease the false positive rate
we observed that it substantially increases the TSBF size. However, while storing a TSBF in a
peer’s routing index per neighbor can contribute to increase in index sizes, this storage
requirement is not a burden to an average internet computing device. Moreover, as long as the
false positive rate of the TSBF is at an acceptable level for the application, there is no need to
unnecessarily increase the filter size. We found through experimentation that the optimal filter
17. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
27
size for our trace was approximately 250 bits. However, those peers that have storage limitations
and cannot afford to reserve a filter size below this size have the option of either compressing
their routing indices as illustrated in Section 4.2.2 or to eliminate one of the levels of TSBFs
stored in routing indices to observe their storage restrictions.
Figure 6. Recall vs. Filter Size (TTL=7, Average Query Length=2.5)
Table 2. Storage Overhead.
Algorithm Filter size (bits)
100 300 500
OSQR 10752.01 10752.01 10752.01
BSI (TSBF level 1+level 2) 4875.00 14625.00 24375.00
BSI (TSBF-level 1) 4800.00 14400.00 24000.00
BSI (TSBF-level 2) 56.25 168.75 281.25
5.2.4 Search Overhead
Using TSBFs for knowledge transfer and acquisition process results in significant lowering of
data transmission. OSQR consumes a significant amount of bandwidth in data transfer. OSQR on
average consumed 13.94KB of bandwidth per query in data transfer. This was in contrast to the
4.13KB and 3.88KB consumed by a query in BSI (TSBF level 1+level 2)and BSI (TSBF-level 1)
respectively. This corresponded to a 70.35%, and 95.46% reduction over OSQR in traffic cost for
BSI (TSBF level 1+level 2)and BSI (TSBF-level 1) respectively. Random Walk search message
consumed only 1.02KB bandwidth as a message does not carry additional information for
knowledge dissemination. The high search cost of OSQR is attributed to the fact that it transmits
routing index information about each taxonomy concept for every visited neighbor as feedback.
Queries in BSI on the other hand transfers much more space efficient TSBFs relevant only to the
queried taxonomy concepts thus significantly reducing the bandwidth consumption.
18. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
28
Figure 7. Search Cost (TTL=7, Filter Size=250bits, Average Query Length=2.5)
To get an accurate picture of overall search efficiency of a search algorithm we need a combined
measure of both recall and search cost. Therefore we define the search efficiency to be the ratio of
recall with search cost. In Figure 8 we plot the search efficiency of BSI, OSQR and Random
Walk against network size. From the results it is clear that efficiency of BSI (both versions) is
much higher compared to OSQR and Random Walk, and BSI is capable of producing high recall
at much lower message cost.
Figure 8. Search Efficiency (TTL=7, Filter Size=250bits, Average Query Length=2.5)
6. CONCLUSION AND FUTURE WORK
Routing in unstructured P2P networks is a challenging problem. In this work, we have explored
the approach of compact representation of large semantic query based indices through the use of
Bloom filters. Our Bloom filter based routing index quantifies the strength of a peer’s
neighborhood for individual queries and then uses this information for forwarding queries. The
simple yet powerful design of BSI makes it easy to implement. Simulation based comparison with
state-of-the art query routing mechanisms, under a wide variety of scenarios establish the
performance advantages of BSI. The BSI search framework presents interesting possibilities of
implementing high level semantics of trust, reliability, etc. using routing and forwarding policies.
19. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
29
REFERENCES
[1] Napster website. http://www.napster.com/, 2006.
[2] Gnutella website. http://www.gnutella.com/, January 2007.
[3] Madhan Arumugam, Amit Sheth, and I. Budak Arpinar. “Towards peer-to-peer semantic web: A
distributed environment for sharing semantic knowledge on the web”. Technical report, 2001.
[4] Michael Bendersky and W. Bruce Croft, “Discovering key concepts in verbose queries” In
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in
information retrieval, ACM SIGIR ’08, pages 491–498, New York, NY, USA.
[5] Michael W. Berry, Zlatko Drmac, and Elizabeth R. Jessup. “Matrices, vector spaces, and information
retrieval” In SIAM Rev., 41(2):335–362.
[6] Arturo Crespo and Hector Garcia-Molina. “Semantic overlay networks for p2p systems” In
Proceedings of the Third international conference on Agents and Peer-to-Peer Computing, AP2PC’04,
pages 1–13, 2004.
[7] Rasanjalee Dissanayaka, S.B. Navathe, and Sushil K. Prasad. “OSQR: A framework for ontology-
based semantic query routing in unstructured p2p networks” In Proceedings of international IEEE
HiPC conference on High Performance Computing, HiPC ’12, 2012.
[8] Christos Gkantsidis, Milena Mihail, and Amin Saberi. “Random walks in peer-to-peer networks:
algorithms and evaluation” Perform. Eval., 63(3):241–263, 2006.
[9] Hai Jin and Hanhua Chen. “Semrex: Efficient search in a semantic overlay for literature retrieval”
Future Gener. Comput. Syst., 24(6):475–488, 2008.
[10] Vana Kalogeraki, Dimitrios Gunopulos, and D. Zeinalipour-Yazti. “A local search mechanism for
peer-to-peer networks” In Proceedings of the eleventh international conference on Information and
knowledge management, ACM CIKM ’02, pages 300–307, 2002.
[11] Juan Li and Son Vuong. “Ontsum: A semantic query routing scheme in p2p networks based on
concise ontology indexing” In Proceedings of the 21st International Conference on Advanced
Networking and Applications, IEEE AINA ’07, pages 94–101, 2007.
[12] Qin Lv, Pei Cao, Edith Cohen, Kai Li and Scott Shenker. “Search and replication in unstructured
peer-to-peer networks” In Proceedings of the 16th international conference on Supercomputing, ICS
’02, ACM, pages 84–95, New York, NY, USA, 2002.
[13] H. Mashayekhi, J. Habibi, and H. Rostami. “Efficient semantic based search in unstructured peer-to-
peer networks” In Modeling Simulation, 2008. AICMS 08. Second Asia International Conference on,
pages 71–76, 2008.
[14] Loizos Michael, Wolfgang Nejdl, Odysseas Papapetrou, and Wolf Siberski. “Improving distributed
join efficiency with extended bloom filter operations” In AINA IEEE, pages 187–194, 2007.
[15] Wolfgang Nejdl, Boris Wolf, Changtao Qu, Stefan Decker, Michael Sintek, Ambj¨orn Naeve, Mikael
Nilsson, Matthias Palm´er, and Tore Risch. “Edutella: a p2p networking infrastructure based on rdf”
In Proceedings of the 11th international conference on World Wide Web, ACM ’02, pages 604–615,
New York, NY, USA, 2002.
[16] Odysseas Papapetrou, Wolf Siberski, and Wolfgang Nejdl. “Cardinality estimation and dynamic
length adaptation for bloom filters” Distrib. Parallel Databases, 28(2-3):119–156, December 2010.
[17] Sukriti Ramesh, Odysseas Papapetrou, and Wolf Siberski. “Optimizing distributed joins with bloom
filters” In Proceedings of the 5th International Conference on Distributed Computing and Internet
Technology, ICDCIT ’08, Springer-Verlag, pages 145–156, Berlin, Heidelberg, 2009.
[18] Habib Rostami, Jafar Habibi, Hassan Abolhassani, Mojtaba Amirkhani, and Ali Rahnama. “An
ontology based local index in p2p networks” In Proceedings of the Second International Conference
on Semantics, Knowledge, and Grid, SKG ’06, IEEE, pages 11–, Washington, DC, USA, 2006.
[19] Mona Taghavi, Ahmed Patel, Nikita Schmidt, Christopher Wills, and Yiqi Tew. “An analysis of web
proxy logs with query distribution pattern approach for search engines” Computer Standards &
Interfaces, 34(1):162 – 170, 2012.
[20] Chunqiang Tang, Zhichen Xu, and Mallik Mahalingam. “psearch: information retrieval in structured
overlays” SIGCOMM Comput. Commun. Rev., 33(1):89–94, January 2003.
[21] Beverly Yang and Hector Garcia-molina. “Efficient search in peer-to-peer networks” In Proceedings
of the 22nd International Conference on Distributed Computing Systems, 2002.
20. International Journal of Peer to Peer Networks (IJP2P) Vol.6, No.1, February 2015
30
AUTHORS
Rasanjalee Dissanayaka received her Ph.D. in Computer Science from Georgia State
University in 2014. She is currently serving as an Assistant Professor of Computer
Science at College of St. Benedict & St. Johns University since August 2014. Her main
research interests are in the areas of distributed algorithms and semantic computing .
Rasanjalee's publications include papers in international journals and conference
proceedings.
Sushil K. Prasad (BTech’85 IIT Kharagpur, MS’86 Washington State, Pullman; PhD’90
Central Florida, Orlando - all in Computer Science/Engineering) is a Professor of
Computer Science at Georgia State University (GSU) and Director of Distributed and
Mobile Systems (DiMoS) Lab. He has carried out theoretical as well as experimental
research in parallel and distributed computing, resulting in 120+ refereed publications,
several patent applications, and about $3M in external research funds as principal
investigator and over $6M overall (NSF/NIH/GRA/Industry).
Shamkant Navathe is currently a professor at the College of Computing, Georgia
Institute of Technology, Atlanta. He is known for his work on database modeling,
database conversion, database design, distributed database fragmentation, allocation,
and redesign, and database integration. Current projects include biological genome
databases, data mining, modeling of database security, and knowledge base design. He
has worked with IBM and Siemens in their research divisions and has been a
consultant to various companies, including Honeywell, Nixdorf, CCA, ADR, Digital,
MCC, Harris, Equifax and Hewlett Packard corporations.
Vikram Goyal is a faculty in IIIT Delhi. He received his PhD in Computer Science
and Engineering from the Department of Computer Science and Engineering at IIT
Delhi in 2009. Before pursuing PhD, he completed his M.Tech in Information
Systems from the Department of Computer Science and Engineering at NSIT Delhi in
2003. His primary area of research is Spatial-Textual Data Management and Data
Privacy. He has been given a DST Young Scientist Award for his research on data
privacy in Location-based Services.