Enhanced response time: subscribers to these services do not need to wait for their queries to be processed; instead, results are delivered to them as soon as they are available and match their filtering options.
Enhanced results: query results relate only to the filter criteria subscribers are interested in, which lets subscribers filter the huge amount of available information.
Database resource utilization and increased capacity: instead of processing user queries one at a time, multiple subscriptions can be processed together, resulting in shorter query processing time and increased overall capacity of the database.
Loosely coupled relationship between publishers and subscribers: publishers and subscribers do not need to know each other, so their identities can remain anonymous, and only the notification system is responsible for managing them. In addition, neither publishers nor subscribers need to know the system topology.
Scalability: the publish-subscribe model scales well for small systems, where it provides parallel processing of multiple queries, message caching and routing functions. For larger and more complex publish-subscribe implementations, scalability becomes a challenge that needs more research effort.
events or pattern of events
Space decoupling: publishers and subscribers do not need to know each other. As stated before, the integration between publishers and subscribers is done by the notification service; publishers do not know who subscribes to their events or how many subscribers there are, and subscribers do not know who publishes an event or how many publishers there are.
Time decoupling: publishers and subscribers do not need to be running at the same time. For example, some subscribers may be offline when notifications are published; when they come back up they receive those notifications, even if the publishers are offline by then.
Synchronization decoupling: the operations and tasks of publishers and subscribers are not halted while notifications are published and received. Notifications are delivered to subscribers asynchronously when the events occur.
The result of decoupling publishers and subscribers is a scalable system that removes dependencies between these parties, which makes the model fit distributed systems very well.
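The three kinds of decoupling can be illustrated with a toy in-memory notification service. This is only a sketch under stated assumptions: the class `NotificationService` and its methods `subscribe`, `publish` and `collect` are hypothetical names invented for this example, and a real service would add persistence and delivery guarantees.

```python
from collections import defaultdict, deque


class NotificationService:
    """Toy broker: publishers and subscribers only ever talk to this object."""

    def __init__(self):
        self.subscriptions = defaultdict(set)   # event name -> subscriber ids
        self.mailboxes = defaultdict(deque)     # subscriber id -> pending notifications

    def subscribe(self, subscriber_id, event_name):
        self.subscriptions[event_name].add(subscriber_id)

    def publish(self, event_name, payload):
        # Space decoupling: the publisher never sees who is subscribed.
        # Synchronization decoupling: publish() only enqueues and returns.
        for subscriber_id in self.subscriptions[event_name]:
            self.mailboxes[subscriber_id].append((event_name, payload))

    def collect(self, subscriber_id):
        # Time decoupling: a subscriber that was offline picks up
        # everything queued for it whenever it comes back.
        pending = list(self.mailboxes[subscriber_id])
        self.mailboxes[subscriber_id].clear()
        return pending


service = NotificationService()
service.subscribe("alice", "car_for_sale")
service.publish("car_for_sale", {"Model": "Cadillac", "Price": 30000})
notifications = service.collect("alice")   # alice comes online after the publish
```

The publisher in this sketch never holds a reference to "alice", and "alice" reads her mailbox at a time of her own choosing.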
These other communication models have some similarities with publish-subscribe, but they fail to fully decouple publishers and subscribers in terms of time, space and synchronization. RPC was later enhanced to avoid synchronization issues through an asynchronous variant in which the call returns no acknowledgment message, which reduces reliability. To overcome this, another approach was proposed in which acknowledgments exist but are accessed only when needed.
There are several subscription models, which differ in how a subscriber expresses interest in events and in how events are filtered to match that interest. The degree to which events can be filtered to match subscribers' interests depends heavily on the expressive power of the subscription language used.
In traditional request/response querying of data, clients usually pull information from the database into their applications. Another method is to push information from the server to the client, as news services do. Because request/response protocols such as HTTP do not let a server start a transfer to a client on its own, a "smart pull" can be used instead: a system service installed on the client runs in the background and periodically asks for new information.
A guarantee of successful delivery is needed because publishers and subscribers interact asynchronously with each other. Because publish-subscribe systems serve a large, dynamic community of publishers and subscribers on heterogeneous platforms, security must be carefully addressed and assessed in terms of authentication, confidentiality, integrity and accountability.
Metadata is needed to store information about the values used in the condition predicates: expressions are usually not self-descriptive, and the same value in a condition can produce different results depending on its type.
To sum up, expressions allow us to list all subscribers of an event using a single query, which can be scaled up to include more subscribers with the same interests. In addition, they allow us to use relational databases and exploit their capabilities when constructing expression queries.
A continuous query is a type of query that is constructed only once and stored for continuous use over the database. Continuous queries were first introduced to support Tapestry systems, which store electronic documents such as emails and news articles in a database, together with additional information such as author, title, date and other keywords. Continuous queries are used over append-only databases, in which newly added documents remain in the database and are never removed. TQL (the Tapestry Query Language) lets users run a query against the database, refine it until they are satisfied with it, and then store it as a continuous query.
Non-deterministic results: the results of a query depend on when it is executed. If the same query is executed at different times, different results will be obtained.
Duplicates: because a continuous query is executed over an append-only database, all old and new results are returned to users, although they may be interested only in the new ones.
Inefficiency of the system: since all new and old data is retrieved each time, a large amount of data is returned on every execution, resulting in longer execution time and degraded system performance.
It allows us to use relational databases to execute queries more efficiently, with enhanced performance.
It supports time-oriented queries without the need to use triggers.
It provides the flexibility to execute scheduled queries on a user-preference basis.
Publish-Subscribe Model Overview Presented by: Ishraq Fatafta
Agenda
- Introduction.
- Publish-Subscribe model overview.
- Publish-Subscribe, a database perspective:
  - Expressions.
  - Continuous query.
  - XML.
- Conclusion.
Introduction
- Traditional system-centric approach
  - Request/response query of data.
  - Data volume and response time.
- Data-centric approach
  - Publishers
  - Subscribers
  - Notification system
Introduction Cont.
- Publish-Subscribe model advantages:
  - Enhanced response time.
  - Enhanced results.
  - Database resource utilization and increased capacity.
  - Loosely coupled relationship between publishers and subscribers.
  - Scalability.
Publish-Subscribe Model Overview
- Described as events, or patterns of events, produced by publishers; subscribers register interest in them and are notified when they become available.
- Information is referred to as notifications in this paradigm.
- Subscribers can continue their tasks until the notification service delivers notifications.
Publish-Subscribe Basic Model Overview Cont.
- Decoupling types between publishers and subscribers:
  - Space decoupling: publishers and subscribers do not need to know each other.
  - Time decoupling: publishers and subscribers do not need to be running at the same time.
  - Synchronization decoupling: the operations and tasks of publishers and subscribers are not halted while notifications are published and received.
- The result is a scalable system that fits well in distributed systems.
Publish-Subscribe Model Overview Cont.
- Other communication models exist besides the publish-subscribe model:
  - Message passing:
    - Relies on messages to establish communication between the sender and the receiver.
    - Message production is asynchronous.
    - Message consumption is synchronous.
    - Both parties need to be available at the same time.
    - Not decoupled in terms of time and space.
Publish-Subscribe Model Overview Cont.
- Other communication models exist besides the publish-subscribe model:
  - Remote procedure call (RPC):
    - Intends to make remote interactions look the same as local interactions.
    - Coupled in time, space and synchronization.
  - Notifications:
    - Notifications sent by the client to the server include callback arguments.
    - Notifications sent by the server to the client include the result.
    - Coupled in time and space.
Publish-Subscribe Model Overview Cont.
- Other communication models exist besides the publish-subscribe model:
  - Shared space:
    - Based on a tuple space: an ordered collection of tuples accessible to all parties.
    - Tuples are added to and deleted from the tuple space synchronously.
    - Decoupled in time and space.
  - Message queuing:
    - Uses a tuple space; producers fill queues with messages, and the message queue provides additional transactional, ordering and timing functionality.
    - Same decoupling as shared space.
Subscription models
- Topic-based subscription model
  - Also referred to as the subject-based model.
  - A subscriber shows interest in a particular topic and receives notifications filtered on that basis.
  - Similar to joining a group, but more dynamic.
  - Hierarchy based.
  - Limited expressiveness: subscribers have few ways to filter and narrow the criteria they are interested in.
  - A single subscription can cover more than one topic.
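Hierarchy-based topic matching can be sketched as a prefix comparison on topic paths. The slash-separated topic names, the `subscriptions` table and the helper names below are assumptions made for illustration; the slides do not prescribe a topic syntax.

```python
def topic_matches(subscribed_topic, event_topic):
    """True if event_topic is the subscribed topic or lies below it in the hierarchy."""
    subscribed = subscribed_topic.split("/")
    event = event_topic.split("/")
    return event[:len(subscribed)] == subscribed


# Hypothetical subscriptions; one subscriber may list several topics.
subscriptions = {
    "alice": ["cars"],
    "bob": ["cars/sedan", "stocks"],
}


def subscribers_for(event_topic):
    """Names of all subscribers whose topics cover the event's topic."""
    return sorted(
        name
        for name, topics in subscriptions.items()
        if any(topic_matches(topic, event_topic) for topic in topics)
    )
```

Calling `subscribers_for("cars/sedan/cadillac")` returns both subscribers, because "cars" and "cars/sedan" are both ancestors of that topic.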
Subscription models Cont.
- Content-based subscription model
  - Bound to the content of the events themselves rather than to external criteria.
  - A subscription language is used for filtering:
    - CarBrand = 'Mercedes' and Price <= 20,000
    - StockName = 'T*' and change > 3
  - Needs more expressive criteria, which can generate a lot of traffic on the network.
  - Needs a more advanced and complex notification system that can filter each event and extract the matching subscriptions.
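The two example filters above can be expressed as lists of attribute constraints and checked against each event. This is a minimal sketch: the tuple encoding, the `like` operator for the `T*` wildcard and the subscriber names are assumptions of this example, not part of any particular subscription language.

```python
import operator
from fnmatch import fnmatchcase

# Comparison operators available to subscriptions in this sketch.
OPERATORS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "like": lambda value, pattern: fnmatchcase(value, pattern),  # shell-style wildcard
}

# Each subscription is a conjunction of (attribute, operator, value) constraints.
SUBSCRIPTIONS = {
    "car_buyer": [("CarBrand", "=", "Mercedes"), ("Price", "<=", 20000)],
    "trader": [("StockName", "like", "T*"), ("change", ">", 3)],
}


def matches(event, constraints):
    """An event matches when it carries every attribute and satisfies every constraint."""
    return all(
        attr in event and OPERATORS[op](event[attr], value)
        for attr, op, value in constraints
    )


def matching_subscribers(event):
    return [name for name, constraints in SUBSCRIPTIONS.items() if matches(event, constraints)]
```

Every published event is tested against every subscription here, which is the per-event filtering cost the slide refers to.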
Subscription models Cont.
- Type-based subscription model
  - Built using concepts from object-oriented programming.
  - Events are objects that can hold attributes and methods, and notifications are objects of a specific type.
  - Subscribers to a specific object type receive only instances of that type or of its sub-types.
  - Performance issues arise when a large number of events must all be processed at runtime.
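Type-based delivery maps naturally onto class inheritance. The sketch below assumes hypothetical event classes (`CarEvent`, `LuxuryCarEvent`, `StockEvent`) and a hypothetical `TypedBroker`; it shows only the rule that a subscriber to a type also receives instances of its sub-types.

```python
class Event:
    """Base type for all notifications."""


class CarEvent(Event):
    def __init__(self, brand, price):
        self.brand = brand
        self.price = price


class LuxuryCarEvent(CarEvent):
    """Sub-type: delivered to CarEvent subscribers as well."""


class StockEvent(Event):
    def __init__(self, name, change):
        self.name = name
        self.change = change


class TypedBroker:
    def __init__(self):
        self.handlers = []  # list of (event type, callback)

    def subscribe(self, event_type, callback):
        self.handlers.append((event_type, callback))

    def publish(self, event):
        # isinstance() is true for the subscribed type and all of its sub-types.
        for event_type, callback in self.handlers:
            if isinstance(event, event_type):
                callback(event)


broker = TypedBroker()
car_inbox, luxury_inbox = [], []
broker.subscribe(CarEvent, car_inbox.append)
broker.subscribe(LuxuryCarEvent, luxury_inbox.append)

broker.publish(LuxuryCarEvent("Cadillac", 30000))
broker.publish(CarEvent("Mercedes", 18500))
broker.publish(StockEvent("TSLA", 4.2))
```

After these three calls, `car_inbox` holds both car events, `luxury_inbox` holds only the luxury one, and the stock event reaches neither.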
Subscription related characteristics
- Push and pull
- Time driven and data driven
- Full update and incremental update
- Broadcast and unicast data delivery
Quality measures of publish-subscribe services
- Quality measures and metrics when designing any publish-subscribe model:
  - Reliability.
  - Security.
  - Priority.
  - Latency.
Publish-subscribe with expressions
- A Boolean expression specifies a subscriber's interest in an event by filtering on criteria built from name-value pairs, comparison operators (=, >, >=, <, <=) and regular expressions.
- We will use SQL and a relational database.
Publish-subscribe with expressions Cont.
- Example: interested in cars for sale
- Brand Cadillac and price less than 35000
- Rule:
    ON Car4Sale
    IF (Model = 'Cadillac' and Price < 35000)
    THEN notify('email@example.com')
Publish-subscribe with expressions Cont.
    SubscriberID  Address  ...  Interest                                ...
    100           Amman    ...  Model = 'Cadillac' and Price < 35000   ...
    101           Irbid    ...  Model = 'Mercedes' and Year >= 2007    ...

    SELECT * FROM [SUBSCRIBERS]
    WHERE EVALUATE(SUBSCRIBERS.Interest, <DATA ITEM>) = 1
Publish-subscribe with expressions Cont.
- Queries can be simple or complex, with any type of join.
- Publishers can put limitations on predicates.
    SELECT * FROM SUBSCRIBERS
    WHERE EVALUATE(SUBSCRIBERS.Interest, <CAR DETAILS>) = 1
      AND SUBSCRIBERS.Address = 'Amman'
    ORDER BY SUBSCRIBERS.SubscriberID DESC
Publish-subscribe with expressions Cont.
- Storing expressions as table data
  - Store these conditions as data in columns of a special type.
  - Metadata is needed:
    - To store information about the values used in the condition predicates.
    - To list the built-in and user-defined functions referenced by the condition.
    - To validate stored values when new columns are added or existing ones are modified.
  - Indexes can be added.
Publish-subscribe with expressions Cont.
- Evaluating expressions
  - The EVALUATE operator is new to SQL.
  - The conditional expression is translated into a WHERE condition in SQL.
  - Expression metadata is used to determine the structure of the FROM clause.
  - The result returned is 1 (true) when the condition is satisfied.
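The idea of storing conditions as data and translating them into a WHERE clause can be imitated with SQLite from Python. This is a rough sketch, not the EVALUATE operator of any database product: the `evaluate` function, its JSON encoding of the data item and the scratch table named `item` are all assumptions of this example, and splicing stored text into SQL is tolerable only in a toy.

```python
import json
import sqlite3


def evaluate(interest, item_json):
    """Return 1 if the stored condition holds for the data item, else 0.

    The condition is used as the WHERE clause of a query over a one-row
    scratch table whose columns are the attributes of the data item.
    """
    item = json.loads(item_json)
    scratch = sqlite3.connect(":memory:")
    columns = ", ".join(item)
    placeholders = ", ".join("?" for _ in item)
    scratch.execute(f"CREATE TABLE item ({columns})")
    scratch.execute(f"INSERT INTO item VALUES ({placeholders})", list(item.values()))
    (hit,) = scratch.execute(
        f"SELECT EXISTS(SELECT 1 FROM item WHERE {interest})"
    ).fetchone()
    scratch.close()
    return hit


db = sqlite3.connect(":memory:")
db.create_function("EVALUATE", 2, evaluate)  # register the Python function for use in SQL
db.execute("CREATE TABLE SUBSCRIBERS (SubscriberID INTEGER, Address TEXT, Interest TEXT)")
db.executemany(
    "INSERT INTO SUBSCRIBERS VALUES (?, ?, ?)",
    [
        (100, "Amman", "Model = 'Cadillac' AND Price < 35000"),
        (101, "Irbid", "Model = 'Mercedes' AND Year >= 2007"),
    ],
)

# The published data item: a car offered for sale.
car = json.dumps({"Model": "Cadillac", "Price": 30000, "Year": 2005})
matched = db.execute(
    "SELECT SubscriberID FROM SUBSCRIBERS WHERE EVALUATE(Interest, ?) = 1", (car,)
).fetchall()
```

A single query over SUBSCRIBERS finds every interested subscriber, and ordinary predicates such as `Address = 'Amman'` can be added to the same WHERE clause.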
Publish-subscribe with expressions Cont.
    SELECT DISTINCT SUBSCRIBERS.SubscriberID,
           (CASE WHEN SUBSCRIBERS.annual_income > 100000
                 THEN notify_salesperson(SUBSCRIBERS.PhoneNumber)
                 ELSE create_email_msg(SUBSCRIBERS.EmailAddress)
            END)
    FROM SUBSCRIBERS, INVENTORY
    WHERE EVALUATE(SUBSCRIBERS.Interest, <car details FROM INVENTORY>) = 1
      AND Sub_DISTANCE(SUBSCRIBERS.Address, :DealerLoc, 'distance=50') = 'TRUE'
    GROUP BY SUBSCRIBERS.SubscriberID
Continuous Query
- Queries constructed only once and stored for continuous use over the database.
- Used over append-only databases.
- First used to support Tapestry systems.
- Uses a time-based approach rather than triggers.
- Uses a language called TQL (Tapestry Query Language).
Continuous Query Cont.
    FOREVER DO
        Execute query Q
        Return results to user
        Sleep for some period of time
    ENDLOOP
Continuous Query Cont.
- Continuous queries suffer from some drawbacks:
  - Non-deterministic results.
  - Duplicates.
  - Inefficiency of the system.
- To overcome this:
  - Incremental queries, which run periodically.
  - Each uses two timestamps: the last execution time (T) and the current time (t).
  - Only results from the period between T and t are returned.
Continuous Query Cont.
- Incremental queries:
    Set T := -∞
    FOREVER DO
        set t := current time
        Execute query Q(T, t)
        Return result to user
        set T := t
        Sleep for some period of time
    ENDLOOP
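The incremental loop can be tried out against SQLite with a simulated clock in place of the sleep. This is a sketch under stated assumptions: the table `tbl` with columns `field` and `ts`, the integer timestamps and the function `run_incremental` are made up for illustration, and a starting value of 0 stands in for minus infinity because all timestamps here are positive.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tbl (field TEXT, ts INTEGER)")  # append-only by convention


def run_incremental(last_run, now):
    """Return only rows with last_run <= ts < now, i.e. rows new since the previous run."""
    return db.execute(
        "SELECT field, ts FROM tbl WHERE field = 'test' AND ts >= ? AND ts < ?",
        (last_run, now),
    ).fetchall()


last_run = 0   # stands in for T := -infinity before the first execution
batches = []

# Simulated clock: each entry is (current time, rows appended before that time).
schedule = [
    (10, [("test", 3), ("other", 4)]),
    (20, [("test", 12)]),
    (30, []),
]
for now, new_rows in schedule:
    db.executemany("INSERT INTO tbl VALUES (?, ?)", new_rows)
    batches.append(run_incremental(last_run, now))
    last_run = now   # T := t
```

Each row appears in exactly one batch, so the user sees no duplicates and each execution touches only the new time window.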
Continuous Query Cont.
- To overcome duplicates, queries are transformed into monotone queries.
- Monotone queries are queries whose result set never loses tuples as new tuples are added to the database.
    SELECT * FROM tbl
    WHERE tbl.field = "test" AND tbl.ts < t

    SELECT m.msgid FROM m
    WHERE NOT EXISTS (
        SELECT * FROM m ml
        WHERE ml.inreplyto = m.msgid
          AND t < ml.ts + 2 weeks )
XML
- Importance of XML as a standard information exchange mechanism.
- Its capability to encode structural information in documents.
- Using XML to create user profiles.
- Using XFilters, a mechanism that matches XML documents against user profiles; matched documents are filtered and returned to interested users using relational databases and XPath.
XML Cont.
- An XPath query is decomposed into a set of path nodes by an XPath parser.
- Tags are extracted from these nodes and stored in a TagPath table.
- A linear path is extracted from the user's subscription XPath profile and stored in the LinearPath table.
- The TagPath table is used to match linear paths in users' subscriptions with the TagPath entries from XML documents.
XML Cont.
- SQL queries can perform any DML operation on these tables.
- An SQL query is run recursively to match XML messages with subscriptions.
- Values of the stored path tags are used as predicates in the join.
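Matching XML documents against XPath-style user profiles can be tried with Python's standard library. This sketch does not reproduce the TagPath and LinearPath tables or the recursive SQL join described above; it relies on the limited XPath subset of `xml.etree.ElementTree`, and the profiles, user names and document are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical user profiles, each a path expression relative to the document root.
PROFILES = {
    "alice": "./car[brand='Cadillac']",   # a car whose brand child has this text
    "bob": "./car/price",                 # any car that carries a price
    "carol": "./truck",                   # any truck listing
}


def matching_users(xml_text):
    """Return the users whose profile path selects at least one element."""
    root = ET.fromstring(xml_text)
    return sorted(
        user for user, path in PROFILES.items() if root.find(path) is not None
    )


document = (
    "<listing><car><brand>Cadillac</brand><price>30000</price></car></listing>"
)
interested = matching_users(document)
```

Here every profile is evaluated separately for each document; storing path tags in tables, as in the slides, is meant to let one join do that matching for many subscriptions at once.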
Conclusion
- A publish-subscribe system consists of:
  - Publishers, who wish to disseminate messages in the form of events to interested users.
  - Subscribers, who wish to be notified of these events by subscribing to them.
  - A notification management system that maintains a database of all publishers and subscribers.
Conclusion Cont.
- The database is used to match events and subscriptions by evaluating events based on:
  - Expressions provided by subscribers in the form of queries stored in the database.
  - Continuous queries that target append-only databases using a time-based approach.
  - XML that uses XFilters to match XML documents against user profiles, then filters and returns them using SQL queries in relational databases and XPath queries.