Group Finder

GROUPFINDER
ANDROID APPLICATION FOR EXPLORING DISCUSSION BOARDS
Abhidnya Patil (91882839)
Soham Kulkarni (20005264)
Madhur J. Bajaj (36562594)
Department of Informatics and Computer Science
University of California, Irvine
ABSTRACT
The Internet has significantly changed the
scale of distributed systems, driving interest
in more flexible communication models and
systems. Individual point-to-point and
synchronous communications, which tend to
produce rigid, static applications, are giving
way to the more loosely coupled communication
supported by the publish-subscribe paradigm.
In this report, we define a taxonomy for
examining publish-subscribe middleware, citing
examples from the systems included in the
study. We review existing publish-subscribe
systems to find their application in our
project and examine the elements that help in
scaling the infrastructure rapidly. The paper
also presents a study of how integrating the
publish-subscribe system, which forms the
backbone of our Notification Manager, with a
search engine improves content relevance in
GroupFinder. The study includes examples that
illustrate the use of different
publish-subscribe frameworks.
I. INTRODUCTION
The publish-subscribe paradigm is receiving
increased attention for the loosely coupled
form of interaction it provides in large-scale
settings. In general, subscribers register
their interest in a topic or a pattern of
events and then asynchronously receive events
matching their interest, regardless of the
events' publisher [5]. In this paper, we first
define a taxonomy for reviewing
publish-subscribe systems, citing examples from
the systems included in the review. We then
give a few examples of existing
publish-subscribe systems. The systems examined
in this survey are drawn from the standalone
publish-subscribe systems presented in the
references. Because of its asynchronous nature
and inherent decoupling properties,
publish/subscribe middleware for distributed
environments has been widely used in the design
of many distributed applications.
The routing algorithm used by a pub/sub
system is vital to managing performance, load
distribution, and scalability. Although systems
like DYNATOPS [1], MERC, and D-DBR [2] may
offer support for information streams, their
purpose is broader than standalone support for
the communication paradigm, so they are
excluded here. Finally, we present a basic code
example of a publisher and subscriber using the
Feedback based Search Engine [5] as the example
API. Publish-subscribe systems must address
versatility, in terms of subscription
management, and efficient coordination.

II. PROBLEM STATEMENT
Group-Finder is a forum for exploring or
adding social media groups across the world,
ranging from job search, fun, entertainment,
and university groups to housing and
recreational activities. In this project, we
present an infrastructure for searching various
discussion boards from the corpus, which is a
compilation of links to discussion boards from
Facebook, WhatsApp, Quora, etc. The aim of the
project is to remove the hassle that one must
go through to reach a pertinent message board.
The project overview can be seen in Fig. 4.
III. RELATED WORK
Publish-subscribe systems differ in several
central attributes. The most common division
of systems is into the general categories of
subject-based and content-based systems.
Systems also differ in their architectures.
Publish-subscribe systems can be push-based,
pull-based, or both. In push-based systems,
messages are automatically delivered to
subscribers. This model provides tight
consistency and stores minimal data. Pull-based
models can be more responsive to client needs.
A. Topic-Based Vs Content-Based
There are two general categories of
publish-subscribe systems: subject-based and
content-based. In subject-based systems, a
message belongs to one of a fixed set of what
are variously referred to as groups, channels,
or topics. A subscription targets a group,
channel, or topic, and the client receives all
events associated with that group. Brokering a
connection between publishers and subscribers
is the act of connecting a channel supplier
with a channel consumer.
• In a topic-based system, messages are
published to "topics" or named logical
channels. Subscribers in a topic-based
system receive all messages published to
the topics to which they subscribe, and
all subscribers to a topic receive the
same messages.
• In a content-based system, messages are
delivered to a subscriber only if the
attributes or content of those messages
match constraints defined by the
subscriber. In our application, the
metadata associated with every group
constitutes the content associated with
that group.
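The difference between the two models can be illustrated with a minimal Python sketch. The class names, the `tags` field, and the sample groups are hypothetical and chosen only for illustration; they are not taken from the Group-Finder implementation.

```python
from collections import defaultdict


class TopicBroker:
    """Topic-based: every subscriber of a topic receives every message on it."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of inboxes

    def subscribe(self, topic, inbox):
        self.subscribers[topic].append(inbox)

    def publish(self, topic, message):
        for inbox in self.subscribers[topic]:
            inbox.append(message)


class ContentBroker:
    """Content-based: deliver only if the subscriber's predicate matches."""

    def __init__(self):
        self.subscriptions = []  # (predicate, inbox) pairs

    def subscribe(self, predicate, inbox):
        self.subscriptions.append((predicate, inbox))

    def publish(self, message):
        for predicate, inbox in self.subscriptions:
            if predicate(message):
                inbox.append(message)


# Hypothetical group metadata acting as the message content.
housing = {"name": "UCI Housing", "platform": "Facebook", "tags": ["housing", "irvine"]}
jobs = {"name": "Bay Area Jobs", "platform": "WhatsApp", "tags": ["jobs"]}

topic_inbox, content_inbox = [], []

topics = TopicBroker()
topics.subscribe("housing", topic_inbox)
topics.publish("housing", housing)
topics.publish("jobs", jobs)  # no subscriber for "jobs", so nothing is delivered

content = ContentBroker()
content.subscribe(
    lambda g: g["platform"] == "Facebook" and "housing" in g["tags"],
    content_inbox,
)
content.publish(housing)
content.publish(jobs)  # filtered out by the predicate
```

In the topic-based broker the subscriber names a channel, whereas in the content-based broker the subscriber supplies a predicate over the group metadata, which is closer to how keyword interest is expressed in our application.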
B. DYNATOPS
Recent research on topic-based pub/sub has
explored ways to improve pub/sub efficiency by
constructing an optimal overlay topology that
reduces the number of irrelevant intermediate
overlay hops. DYNATOPS addresses this issue.
DYNATOPS is a dynamic topic-based pub/sub
architecture that provides efficient, scalable,
societal-scale event notifications for dynamic
subscriptions using distributed broker
networks [1].
DYNATOPS involves two network spaces: the
pub/sub user space and the broker space. In
DYNATOPS, users are dynamically assigned to
brokers that manage subscriptions from similar
users; the broker network is built on
structured overlays that are adaptively
reconfigured under dynamic subscription changes
using a subscription-and-structure-aware broker
mapping method. The motivation for this broker
assignment is the rapid change in subscription
patterns.
DYNATOPS proposes a mechanism that minimizes
the number of revisions to be made, first by
dynamically assigning users to brokers and
subsequently by dynamically reassigning brokers
to users as subscriptions change. Compared with
content-based pub/sub systems, where more
sophisticated subscription management and event
routing mechanisms are required, topic-based
pub/sub permits much simpler and more efficient
implementations. Topic-based pub/sub systems
have been widely deployed in settings where
events divide naturally into groups and
efficient data notification is required.
DYNATOPS uses similarity-based broker selection
techniques [1] to map users to nearby brokers.
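To make the idea of similarity-based broker selection concrete, the sketch below assigns a user to the broker whose managed topics overlap most with the user's subscriptions. The use of Jaccard similarity, the function names, and the sample brokers are assumptions made for illustration; the exact metric used by DYNATOPS is not reproduced here.

```python
def jaccard(a, b):
    """Jaccard similarity between two topic sets (1.0 means identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_broker(user_topics, broker_topics):
    """Return the broker whose managed topics are most similar to the user's.

    broker_topics maps a broker id to the set of topics it already manages.
    Iterating in sorted order makes ties resolve to the smallest broker id.
    """
    return max(
        sorted(broker_topics),
        key=lambda b: jaccard(user_topics, broker_topics[b]),
    )


# Hypothetical brokers and one new user.
brokers = {
    "broker-a": {"jobs", "housing"},
    "broker-b": {"sports", "music", "movies"},
}
chosen = select_broker({"housing", "jobs", "carpool"}, brokers)
```

When a user's subscriptions change, calling `select_broker` again with the new topic set gives a simple form of the dynamic reassignment described above.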
C. Scaling Publish Subscribe Systems
The traditional approach to scaling pub/sub
systems is the filter-based routing (FBR)
algorithm, but it has a few efficiency and
flexibility limitations [7]. To overcome these
limitations, a dynamic destination-based
routing algorithm, D-DBR, was proposed [2]. It
divides pub/sub into two tasks, content-based
matching and destination-based multicasting,
leading to low event matching cost and high
efficiency. The proposed technique efficiently
groups clients with similar interests so that
they are managed by the same set of brokers,
which effectively reduces the overhead.
C.1 D-DBR
In D-DBR, the pub/sub system is decoupled
into two layers: the matching layer is in
charge of event matching, while the
multicasting layer is in charge of event
routing. When a publisher issues an event at a
broker, the event is matched against the
subscriptions managed by the broker's matching
engine to obtain the addresses of the brokers
interested in the event. The event is then
delivered to the interested brokers by the
multicasting layer based on the event's
destination addresses. After receiving an
event, a destination broker matches the event
against its local subscriptions and directly
delivers it to the interested subscribers.
Although D-DBR is an effective solution, a
factor limiting its scalability is that each
broker has to know every other broker in the
system; consequently, the topology maintenance
cost can be high for large-scale systems (with
hundreds of brokers or more). To mitigate this
issue and achieve better scalability, a second
routing scheme called MERC (Match at Edge and
Route Intra-Cluster) was proposed [2].
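The two-layer split described for D-DBR can be sketched as follows. This is a simplified illustration under our own assumptions (a full mesh of brokers, predicates as Python callables, and hypothetical names such as `Broker` and `connect`), not the implementation evaluated in [2].

```python
class Broker:
    def __init__(self, name):
        self.name = name
        self.local_subs = []       # (predicate, inbox) for attached subscribers
        self.remote_interest = []  # (predicate, broker name) learned from peers
        self.network = {}          # broker name -> Broker (full mesh)

    def subscribe(self, predicate, inbox):
        """Register a local subscription and advertise it to all other brokers."""
        self.local_subs.append((predicate, inbox))
        for other in self.network.values():
            if other is not self:
                other.remote_interest.append((predicate, self.name))

    def publish(self, event):
        # Matching layer: compute the destination list once, at the first broker.
        destinations = {name for pred, name in self.remote_interest if pred(event)}
        self.deliver_locally(event)
        # Multicasting layer: route by destination address only.
        for name in sorted(destinations):
            self.network[name].deliver_locally(event)
        return sorted(destinations)

    def deliver_locally(self, event):
        # A destination broker matches against its own subscriptions only.
        for pred, inbox in self.local_subs:
            if pred(event):
                inbox.append(event)


def connect(*brokers):
    """Give every broker the same routing table containing all brokers."""
    table = {b.name: b for b in brokers}
    for b in brokers:
        b.network = table


b1, b2, b3 = Broker("b1"), Broker("b2"), Broker("b3")
connect(b1, b2, b3)

inbox_b2, inbox_b3 = [], []
b2.subscribe(lambda e: e["topic"] == "housing", inbox_b2)
b3.subscribe(lambda e: e["topic"] == "jobs", inbox_b3)

destinations = b1.publish({"topic": "housing", "city": "Irvine"})
```

The `network` table held by every broker is exactly the scalability limitation noted above: each broker must know every other broker.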
C.2 MERC
MERC divides the overlay into interconnected
clusters, where it applies content-based and
destination-based mechanisms for inter- and
intra-cluster event routing, respectively.
MERC combines destination-based and
content-based routing hierarchically. It has
the benefits of D-DBR, i.e., low subscription
duplication, low matching cost, and so on. It
also overcomes the scalability limitation of
D-DBR. MERC follows an Internet-like
organization: one cluster can be seen as an
administrative domain, and different clusters
can be connected in a hierarchical way. Thus,
an appealing characteristic of MERC is that it
provides a good reference for building
large-scale pub/sub systems that mimic the
structure of the Internet.
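A rough sketch of the cluster-level idea is given below: content-based filtering decides which clusters receive an event, and a destination list computed at the cluster edge decides which brokers inside the cluster receive it. The `Cluster` class, the `route` function, and the sample data are hypothetical simplifications and do not reproduce the protocol in [2].

```python
class Cluster:
    def __init__(self, name):
        self.name = name
        self.subs = {}  # broker name -> predicates held inside this cluster

    def subscribe(self, broker, predicate):
        self.subs.setdefault(broker, []).append(predicate)

    def interested(self, event):
        """Content-based check used between clusters: does anyone inside care?"""
        return any(p(event) for preds in self.subs.values() for p in preds)

    def destinations(self, event):
        """Destination list computed at the cluster edge for internal routing."""
        return sorted(
            b for b, preds in self.subs.items() if any(p(event) for p in preds)
        )


def route(event, clusters):
    """Return {cluster name: [destination brokers]} for receiving clusters."""
    plan = {}
    for cluster in clusters:
        if cluster.interested(event):  # inter-cluster: content-based
            plan[cluster.name] = cluster.destinations(event)  # intra-cluster
    return plan


west, east = Cluster("west"), Cluster("east")
west.subscribe("b1", lambda e: e["topic"] == "housing")
west.subscribe("b2", lambda e: e["topic"] == "jobs")
east.subscribe("b3", lambda e: e["topic"] == "jobs")

housing_plan = route({"topic": "housing"}, [west, east])
jobs_plan = route({"topic": "jobs"}, [west, east])
```

Because a broker only needs destination knowledge for its own cluster, the per-broker state stays bounded by the cluster size rather than by the size of the whole overlay.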
C.3 PADRES
PADRES (Publish/Subscribe Applied to
Distributed Resource Scheduling) is an
enterprise-grade event management
infrastructure designed for large-scale event
management applications [4]. PADRES primarily
studies application concerns above the
infrastructure layer. In the experiments run on
PADRES [2] to evaluate D-DBR, MERC, and FBR,
with about 100 brokers, D-DBR generally
exhibits the best performance, and MERC lies
between D-DBR and FBR. Nevertheless, MERC
exhibits a smaller destination list size than
D-DBR.
The overlay structure in DYNATOPS [1] can be
scaled by introducing the D-DBR and MERC
routing algorithms [2]. These algorithms
transfer communication messages between brokers
efficiently, reducing the routing overhead
introduced in the scaled pub/sub system.
D. Notification Service
The Notification Service is a propagation
mechanism that acts as a logical intermediary
between publishers and subscribers, so that a
publisher does not need to know all the
subscriptions of every possible subscriber [5].
Both publishers and subscribers communicate
only with a single entity, the Notification
Service, which stores every subscription
associated with the individual subscribers,
receives all the notifications from publishers,
and dispatches each published notification to
the right subscribers. In this way, publishers
and subscribers exchange information without
directly knowing each other. This anonymity is
one of the main features of the pub/sub system.
E. Feedback Based Search Engine
Web search engines are very useful
information service tools on the Internet.
Current web search engines produce search
results based on the search terms and the
information they have collected. Since users'
selections among the search results cannot
affect future results, the results may not
cover most people's interests. In a
feedback-based approach, feedback information
produced from the users' access lists is
represented by a rough set and used to
reconstruct the query string and influence the
search results. Thus, the search engine becomes
self-adaptive.
On many occasions, search engines cannot
determine what kind of information users need.
We examined the structure of the Feedback
Search Engine (FSE), which not only analyzes
the relevance between queries and web pages but
also uses click-through data to evaluate
page-to-page relevance and regenerate
content-relevant search results [6]. The
algorithms supporting the system are described
in [6]. By dynamically regenerating search
results, FSE can give its users more precise
and personalized information.
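As a small illustration of click-through feedback, the sketch below derives a page-to-page relevance score from pages clicked in the same session and uses it to reorder results. Counting co-clicks is our own simplification, and the function names and sample sessions are hypothetical; this is not the FSE algorithm from [6].

```python
from collections import defaultdict
from itertools import combinations


def page_relevance(sessions):
    """Count how often two pages were clicked in the same search session."""
    co_clicks = defaultdict(int)
    for clicked in sessions:
        for a, b in combinations(sorted(set(clicked)), 2):
            co_clicks[(a, b)] += 1
    return co_clicks


def rerank(results, last_clicked, co_clicks):
    """Move pages often co-clicked with last_clicked toward the top.

    sorted() is stable, so pages with equal scores keep their original order.
    """
    def score(page):
        key = tuple(sorted((page, last_clicked)))
        return co_clicks.get(key, 0)

    return sorted(results, key=score, reverse=True)


# Hypothetical click logs: each inner list is one user's session.
sessions = [["g1", "g3"], ["g1", "g3", "g4"], ["g2", "g4"]]
co_clicks = page_relevance(sessions)
reranked = rerank(["g2", "g3", "g4"], "g1", co_clicks)
```

Here a user who last opened `g1` sees `g3` first, because earlier users who clicked `g1` most often also clicked `g3`.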
IV. ARCHITECTURE
The implementation of Group-Finder can be
thought of as divided into three discernible
layers:
1. Application Layer
2. Middleware Layer
3. Database Layer
The overview of the Group-Finder system
architecture is depicted below in Fig. 4.
Fig. 4: Overview of Group-Finder
Application Layer:
It consists of clients, in our case mobile
users interacting with the system through the
mobile application. Clients can perform various
operations, which are elaborated later, but the
interaction can be viewed broadly as
request-response. In Group-Finder, the
interaction with the server takes the form of
JSON messages sent using socket programming in
Java.
Middleware Layer:
It consists of a Server module along with an
Indexer and a Retriever, where the Server,
together with its subordinate modules,
functions as the core middleware. The Indexer
collects, parses, and stores data to facilitate
fast and accurate information retrieval. In
brief, the Indexer outputs the corpus for the
higher-level search engine. Likewise, the
Retriever fetches relevant information from the
corpus. In the Group-Finder implementation,
client-server communication is based on socket
programming and a message-based service
programmed in Java. The underlying search
engine's Indexer and Retriever, which update
and fetch information from the corpus, are
programmed in Python. For the Server to forward
client requests to the search engine modules,
it needs to encapsulate these requests and
initiate the appropriate REST API call.
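The roles of the Indexer and Retriever can be sketched with a small inverted index that maps each keyword to the links containing it, together with a term frequency. The function names, the scoring rule (summed term frequency), and the example links are assumptions for illustration and do not reproduce the project's actual search engine code.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(groups):
    """Indexer: map each keyword to {group link: term frequency}."""
    index = defaultdict(dict)
    for link, description in groups.items():
        for term, freq in Counter(tokenize(description)).items():
            index[term][link] = freq
    return index


def retrieve(index, query):
    """Retriever: rank links by the summed frequency of the query terms."""
    scores = Counter()
    for term in tokenize(query):
        for link, freq in index.get(term, {}).items():
            scores[link] += freq
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [link for link, _ in ranked]


# Hypothetical corpus: link -> group description.
groups = {
    "https://example.org/g/housing": "Irvine housing and roommates, housing near campus",
    "https://example.org/g/jobs": "Software jobs and internships in Irvine",
}
index = build_index(groups)
hits = retrieve(index, "irvine housing")
```

A production retriever would normally weight terms (for example with TF-IDF) rather than summing raw frequencies; the raw count is used here only to keep the sketch short.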
Database Layer:
It consists of a bookkeeping module for
mapping the links to the link corpus
structure/division. Likewise, it keeps track of
the mapping of keywords to their frequencies in
the respective documents. Every time a new
group is added to the corpus, the corpus needs
to be re-indexed. Dynamic indexing can be
implemented to keep the corpus as relevant as
possible. However, dynamic indexing raises a
few implementation issues, such as maintaining
mutual exclusion while building the corpus, for
which we propose two solutions. The first is
periodic re-indexing at the time when client
request traffic is lowest; although this is
inexpensive, it does not guarantee a consistent
corpus at the end of indexing. The second is to
create a copy of the current corpus, apply the
update operation to the copy, and then replace
the parent copy. This shadow copy approach is
time-consuming and expensive, but it guarantees
consistency.
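The shadow copy approach can be sketched as follows. The `Corpus` class and its methods are hypothetical, and the sketch assumes CPython, where rebinding an attribute to the finished copy is effectively a single step, so readers see either the old index or the new one.

```python
import copy
import threading


class Corpus:
    def __init__(self, index):
        self._index = index
        self._write_lock = threading.Lock()  # only one re-index at a time

    def lookup(self, term):
        # Readers use whichever index is current; never a half-built one.
        return self._index.get(term, [])

    def add_group(self, terms, link):
        with self._write_lock:
            # Expensive step: copy the whole corpus before touching it.
            shadow = copy.deepcopy(self._index)
            for term in terms:
                shadow.setdefault(term, []).append(link)
            # Publish the update by swapping the reference to the parent copy.
            self._index = shadow


corpus = Corpus({"housing": ["link-1"]})
old_index = corpus._index
corpus.add_group(["housing", "irvine"], "link-2")
```

The deep copy is what makes this approach expensive for a large corpus, which matches the trade-off described above: consistency is bought with time and memory.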
V. CLIENT SERVER INTERACTION
The interaction between Client and Server is
shown in the following diagram. A client can
communicate with the Server for four different
utilities, viz. login, search keyword, register
keyword, and logout, each with a corresponding
request message.
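A minimal sketch of how the four request messages might be encoded as JSON is shown below. The field names (`type`, `user`, `keyword`) and the helper functions are assumptions made for illustration, and Python is used only for brevity; the actual client and server exchange JSON over Java sockets.

```python
import json

REQUEST_TYPES = {"login", "search", "register", "logout"}


def make_request(kind, user, keyword=None):
    """Client side: serialize one request as a single-line JSON message."""
    if kind not in REQUEST_TYPES:
        raise ValueError(f"unknown request type: {kind}")
    message = {"type": kind, "user": user}
    if kind in {"search", "register"}:
        if not keyword:
            raise ValueError(f"{kind} requests need a keyword")
        message["keyword"] = keyword
    return json.dumps(message, sort_keys=True)


def parse_request(line):
    """Server side: decode a message and reject unknown request types."""
    message = json.loads(line)
    if message.get("type") not in REQUEST_TYPES:
        raise ValueError("unknown request type")
    return message


wire = make_request("search", "client2", "housing")
decoded = parse_request(wire)
```

Validating the request type on both sides keeps the server's dispatch to the login, search, register, and logout routines simple.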

Fig. 5: Client-Server Interaction
The login and logout client-server interaction
is the same as a classic user authentication
flow. Consider the register keyword case, for
instance the register request issued by
Client 2 in the figure above. The Server
forwards the register request to the register
routine, which indicates that Client 2 is
interested in joining a group with the sent
keyword. The register routine logs this, and
all subsequent users who search with the same
keyword will be notified about each other
through the notification manager.
In the case of a search request from the
client, the server sends the keyword and the
requesting user's identity to the search
routine through a REST API call. The search
routine maintains a queue of all search
requests and processes them by invoking the
retriever. The results of the search query are
then packed and sent to the Push Notification
Engine to be forwarded to the intended user. In
addition to forwarding the call to the search
routine, the server also forwards the user's
search query to the register routine. If the
search keyword matches any of the registered
keywords, the register routine sends the
respective message to the Push Notification
Engine, which forwards it to the intended user.
All message exchange is asynchronous; thus, if
a user is offline when the push notification
module wants to connect them with another user,
the message is saved as undeliverable and
delivered when the user is active again. The
implementation of Group-Finder can be made more
sophisticated and can be integrated with
applications that complement its functionality.
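The register and search flow, together with the store-and-forward handling of offline users, can be sketched as follows. The `NotificationManager` class, its method names, and the message texts are hypothetical; the sketch keeps everything in memory and leaves out the REST call and the retriever.

```python
from collections import defaultdict, deque


class NotificationManager:
    def __init__(self):
        self.online = set()
        self.delivered = defaultdict(list)     # user -> messages pushed
        self.undelivered = defaultdict(deque)  # user -> messages waiting
        self.registered = defaultdict(set)     # keyword -> registered users

    def push(self, user, message):
        """Deliver now if the user is online, otherwise store the message."""
        if user in self.online:
            self.delivered[user].append(message)
        else:
            self.undelivered[user].append(message)

    def login(self, user):
        self.online.add(user)
        while self.undelivered[user]:  # flush in arrival order
            self.delivered[user].append(self.undelivered[user].popleft())

    def logout(self, user):
        self.online.discard(user)

    def register(self, user, keyword):
        self.registered[keyword].add(user)

    def search(self, user, keyword):
        """Notify the searcher and registered users about each other."""
        for other in sorted(self.registered[keyword] - {user}):
            self.push(other, f"{user} searched for '{keyword}'")
            self.push(user, f"{other} registered for '{keyword}'")


nm = NotificationManager()
nm.login("alice")
nm.register("bob", "housing")   # bob registers, then stays offline
nm.search("alice", "housing")
pending_for_bob = list(nm.undelivered["bob"])
nm.login("bob")                 # stored message is delivered on login
```

A persistent queue would be needed for the undelivered messages to survive a server restart; the in-memory deque is used here only to show the control flow.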
VI. CONCLUSION
We have consolidated a taxonomy for examining
and classifying different publish-subscribe
systems. This taxonomy includes subject-based
versus content-based classification, system
architecture, matching algorithm, multicasting
algorithm, reliability, and security. Based on
this taxonomy, we reviewed a few existing
publish-subscribe systems. From a middleware
and distributed systems perspective, scaling
the existing publish-subscribe systems is
crucial to our project, and integrating them
with a feedback-based search engine helped us
understand the nuances of self-adaptive search
techniques. This review also helped us identify
the notable features, advantages, and
disadvantages of each design and model in order
to pick the most appropriate one for our
project.
VII. FUTURE SCOPE
In case of Server failure, client requests
can be logged and processed when the Server is
up again, in order to improve the fault
tolerance of the system. The register keyword
functionality can be complemented with
time-based registration, for instance directing
all users looking for a job to me for a period
of one month from now. Likewise, when the
subscription pattern of a user changes, a
revision can be made in the notification
service for every pub/sub relationship. Also,
to retrieve search results more precisely, we
can use clients' click-through data to compute
page-to-page relevance values and use a
Dimension Reduction (DR) algorithm to compress
the relevance data.
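Time-based registration could be sketched as a registry in which every keyword registration carries an expiry time. The `TimedRegistry` class, the explicit `now` argument, and the 30-day default (standing in for one month) are assumptions made for illustration.

```python
from datetime import datetime, timedelta


class TimedRegistry:
    def __init__(self):
        self._entries = {}  # (user, keyword) -> expiry time

    def register(self, user, keyword, now, ttl=timedelta(days=30)):
        """Register interest in a keyword until now + ttl."""
        self._entries[(user, keyword)] = now + ttl

    def interested_users(self, keyword, now):
        """Drop expired registrations, then list users still registered."""
        self._entries = {k: exp for k, exp in self._entries.items() if exp > now}
        return sorted(u for (u, kw) in self._entries if kw == keyword)


start = datetime(2016, 6, 1)
registry = TimedRegistry()
registry.register("alice", "jobs", start)                         # 30 days
registry.register("bob", "jobs", start, ttl=timedelta(days=7))    # 7 days

after_ten_days = registry.interested_users("jobs", start + timedelta(days=10))
after_a_month = registry.interested_users("jobs", start + timedelta(days=31))
```

Passing `now` explicitly keeps the sketch deterministic; a deployed version would read the server clock instead.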

ACKNOWLEDGEMENT
The authors would like to acknowledge the
valuable contributions of Prof. Nalini
Venkatasubramanian and the opportunity extended
to us to explore the nuances of distributed and
middleware systems. Prof. Venkatasubramanian's
suggestions played a vital role in shaping the
implementation of the project. We also express
our gratitude to Kyle Benson for his prompt
help and support in resolving our doubts. We
also take this opportunity to thank all the
contributors to publish/subscribe systems.
REFERENCES
[1] Ye Zhao, Kyungbaek Kim, and Nalini
Venkatasubramanian. “DYNATOPS: A Dynamic
Topic-based Publish/Subscribe Architecture”
Proceedings of the 7th ACM international
conference on Distributed event-based systems.
[2] Shuping Ji, Chunyang Ye, Jun Wei and Hans-
Arno Jacobsen. “Towards Scalable
Publish/Subscribe Systems” 2015 IEEE 35th
International Conference on Distributed
Computing Systems.
[3] Fengliang Qi, Beihong Jin, Haibiao Chen,
Zhenyue Long. “An Efficient Primitive
Subscription Matching Algorithm for RFID
Applications” The 9th International Conference
for Young Computer Scientists.
[4] Jacobsen, H., Cheung, A., Li, G., Maniymaran,
B., Muthusamy, V., & Kazemzadeh, R. S. (n.d.).
The PADRES Pub/Sub System. Principles and
Applications of Distributed Event-Based
Systems. www.msrg.org/projects/padres/
[5] K. R. Jayaram, Patrick Eugster, and Chamikara
Jayalath. “Parametric Content-Based
Publish/Subscribe” ACM Transactions on
Computer Systems (TOCS).
[6] Hou Yuexian, Zhu Honglei, He Pilian. “A
Framework of Feedback Search Engine
Motivated by Content Relevance Mining” WI '06
Proceedings of the 2006 IEEE/WIC/ACM
International Conference on Web Intelligence.
[7] Ludger Fiege, Mariano Cilia, Gero Mühl,
Alejandro Buchmann. “Publish-Subscribe Grows
Up” IEEE Internet Computing Volume 10 Issue 1,
January 2006.
[8] Gupta, A., Sahin, O. D., Agrawal, D., Abbadi,
A. E. (2004). Meghdoot: Content-Based
Publish/Subscribe over P2P Networks.
doi:10.1007/978-3-540-30229-2_14
[9] TIBCO Rendezvous product
http://www.tibco.com/products/automation/enterprise-messaging/rendezvous
[10] ArXiv.org cs arXiv:cs/9810019. Retrieved
May 31, 2016
Link: http://arxiv.org/abs/cs/9810019
