Christophe Romain goes into the details of the ejabberd PubSub implementation. He explains the PubSub plugin system and how to leverage it to optimize ejabberd PubSub for your own use cases.
This talk will teach you how to get more performance and scalability from your PubSub implementation.
2. PubSub overview
• generic publish-subscribe functionality, specified in XEP-0060
v1.13
• More than 100 pages of specifications
• 12 very detailed use cases with many options and
possible situations: Subscribe, Unsubscribe, Configure
subscription, Retrieve items, Publish item, Delete item, Create
node, Configure node, Delete node, Purge node, Manage
subscriptions, Manage affiliations.
• XEP-0163 v1.2 (PEP) based on PubSub
• XEP-0248 (deprecated) for Collection Nodes
3. History
• Initial implementation by Aleksey Shchepin
(ejabberd author)
• Ability to organise nodes in a tree added in 2007
by Christophe Romain
• First attempt to create a plugin API in 2007
• Improvements until 2015
4. Implementation
• A pool of IQ handlers, fed by the ejabberd router
• A sending process
• A core router to perform high level actions for every
use case
• Plugins to handle nodes, affiliations/subscriptions,
and items at lower level and interface with data
backend
5. Nodetree plugins
• They handle storage and organisation of PubSub
nodes. Called on get, create and delete node.
tree (default): both internal and odbc backends
virtual: no backend
dag: handles XEP-0248
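The difference between the tree and virtual nodetrees can be sketched in a few lines. This is an illustrative Python model, not ejabberd's Erlang code: the class names, method signatures and return tuples are assumptions, and a dict stands in for the pubsub_node table.

```python
class TreeNodetree:
    """Stores one record per node (hypothetical stand-in for pubsub_node)."""

    def __init__(self):
        self._nodes = {}  # (host, node) -> configuration dict

    def create_node(self, host, node, config):
        if (host, node) in self._nodes:
            return ("error", "conflict")
        self._nodes[(host, node)] = dict(config)
        return ("ok", node)

    def get_node(self, host, node):
        config = self._nodes.get((host, node))
        if config is None:
            return ("error", "item-not-found")
        return ("ok", config)

    def delete_node(self, host, node):
        if self._nodes.pop((host, node), None) is None:
            return ("error", "item-not-found")
        return ("ok", node)


class VirtualNodetree:
    """Stores nothing: every node exists and shares one configuration."""

    def __init__(self, default_config):
        self._default = dict(default_config)

    def create_node(self, host, node, config):
        return ("ok", node)  # nothing to write

    def get_node(self, host, node):
        return ("ok", dict(self._default))  # no lookup, no I/O

    def delete_node(self, host, node):
        return ("ok", node)
```

Both classes expose the same three calls, so in this model the engine does not need to know which one it is talking to.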
6. Node plugins
• They handle affiliations, subscriptions and items. They
provide default node configuration and features. Called on
every PubSub use case.
• Responsible for the checks needed to handle all possible cases
• Return the action result to the PubSub engine and let it handle the
routing
• Many plugins: flat, hometree, pep, dag, public, private, ...,
yours?
• Few backends: internal, odbc (others possible)
7. Plugin design
• Due to the complexity of XEP-0060, the PubSub engine makes
successive calls to nodetree and node plugins in
order to check validity, perform the corresponding
action and return the result or an appropriate error
• Plugin design follows this requirement and divides
actions by type of data to allow transient backend
implementation without any PubSub engine change
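The successive calls described above can be modelled as a small pipeline. The sketch below is a simplified Python illustration under stated assumptions: StubNodetree, StubNodePlugin and engine_publish are hypothetical names, and the real engine performs many more checks than this.

```python
class PubSubError(Exception):
    """Carries an error condition name such as 'forbidden'."""


class StubNodetree:
    """Hypothetical nodetree plugin: only knows which nodes exist."""

    def __init__(self, known_nodes):
        self._known = set(known_nodes)

    def get_node(self, host, node):
        if (host, node) not in self._known:
            raise PubSubError("item-not-found")
        return (host, node)


class StubNodePlugin:
    """Hypothetical node plugin: checks rights, then stores the item."""

    def __init__(self, publishers):
        self._publishers = publishers  # node key -> set of allowed JIDs
        self.items = {}                # (node key, item id) -> payload

    def publish_item(self, node_key, publisher, item_id, payload):
        if publisher not in self._publishers.get(node_key, set()):
            raise PubSubError("forbidden")
        self.items[(node_key, item_id)] = payload
        return item_id


def engine_publish(nodetree, plugin, host, node, publisher, item_id, payload):
    """Engine side: resolve the node, delegate, then build the reply."""
    try:
        node_key = nodetree.get_node(host, node)          # nodetree plugin call
        stored_id = plugin.publish_item(                  # node plugin call
            node_key, publisher, item_id, payload)
    except PubSubError as err:
        return ("error", str(err))                        # appropriate error
    return ("result", stored_id)                          # engine routes the reply
```

Because the engine only sees return values and errors, either stub could be swapped for one backed by another storage without touching engine_publish.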
17. node_hometree
• Uses the exact same features as the flat plugin
• Organises nodes in a tree, following the same scheme as
paths in a filesystem.
• Every user can create nodes under their own home
root: /home/user and/or /home/domain/user
• Each node can contain items and/or sub-nodes.
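The home-root rule can be illustrated with a small path check. This is a minimal sketch, assuming both root forms from the slide are accepted; can_create_node is a hypothetical helper, and the actual plugin's rules may differ.

```python
def can_create_node(path, user, domain):
    """Return True if path is the user's home root or lies below it."""
    roots = (f"/home/{user}", f"/home/{domain}/{user}")
    # Compare on a "/" boundary so that /home/alice2 is not treated
    # as being inside /home/alice.
    return any(path == root or path.startswith(root + "/") for root in roots)
```

For example, alice on example.org could create /home/example.org/alice/blog in this model, but not /home/example.org/bob/blog.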
18. node_pep
• Handles XEP-0163: Personal Eventing Protocol
• Does not persist items
• Just keeps the last item in a memory cache
• Node names are raw namespaces attached to a
given bare JID
• Every user can have their own node under a
namespace shared with other users
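A last-item cache keyed by bare JID and namespace could look like the sketch below. It is an illustrative Python model only: LastItemCache and its methods are hypothetical names, not ejabberd functions.

```python
class LastItemCache:
    """Keeps only the most recent item per (bare JID, namespace) pair."""

    def __init__(self):
        self._last = {}  # (bare_jid, namespace) -> (item_id, payload)

    def publish(self, bare_jid, namespace, item_id, payload):
        # Overwrite instead of append: older items are simply dropped.
        self._last[(bare_jid, namespace)] = (item_id, payload)

    def last_item(self, bare_jid, namespace):
        return self._last.get((bare_jid, namespace))
```

Two users publishing under the same namespace get two independent entries, which mirrors the idea of one node per user under a shared namespace.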
19. node_dag
• Handles XEP-0248: PubSub Collection Nodes
• Contribution from Brian Cully
• Every node takes its place in a tree and is either a
collection node (has only sub-nodes) or a leaf
node (contains only items)
• No restriction on the tree structure
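The collection/leaf split can be expressed as two simple invariants: parents must be collection nodes, and items only go to leaf nodes. The Python sketch below is an assumption-laden illustration (NodeDag and DagError are hypothetical names); it allows several parents per node to reflect the lack of restriction on structure.

```python
class DagError(Exception):
    pass


class NodeDag:
    def __init__(self):
        self._kind = {}     # node -> "collection" or "leaf"
        self._parents = {}  # node -> set of parent collection nodes
        self._items = {}    # leaf node -> {item_id: payload}

    def create_node(self, node, kind, parents=()):
        if kind not in ("collection", "leaf"):
            raise DagError("unknown node type")
        if node in self._kind:
            raise DagError("conflict")
        for parent in parents:
            if self._kind.get(parent) != "collection":
                raise DagError("parent is not a collection node")
        self._kind[node] = kind
        self._parents[node] = set(parents)

    def publish_item(self, node, item_id, payload):
        if self._kind.get(node) != "leaf":
            raise DagError("items can only be published to leaf nodes")
        self._items.setdefault(node, {})[item_id] = payload

    def children(self, collection):
        """Direct sub-nodes of a collection, in sorted order."""
        return sorted(n for n, ps in self._parents.items() if collection in ps)
```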
20. Available backends
• Flat, hometree and PEP support the mnesia and odbc
backends.
• Any derived plugin can support the same (public,
private, club, buddy...)
• Business Edition also supports ets and mdb
• Adding a backend does not require any PubSub
engine change. Plugins just need to comply with the API.
21. Storage choices
• nodetree plugin handles the pubsub_node table
• node plugin handles subscriptions/affiliations in
pubsub_state (or spread across several tables, depending on the
implementation) and items in the pubsub_item table
• if all nodes share the same configuration, I/O on
pubsub_node can be avoided (nodetree_virtual)
22. Customisation
• Write your own plugin, implementing the required functions:
[init/3, terminate/2, options/0, features/0,
create_node_permission/6, create_node/2, delete_node/1,
purge_node/2, subscribe_node/8, unsubscribe_node/4,
publish_item/6, delete_item/4, remove_extra_items/3,
get_entity_affiliations/2, get_node_affiliations/1,
get_affiliation/2, set_affiliation/3,
get_entity_subscriptions/2, get_node_subscriptions/1,
get_subscriptions/2, set_subscriptions/4,
get_pending_nodes/2, get_states/1, get_state/2,
set_state/1, get_items/7, get_items/3, get_item/7,
get_item/2, set_item/1, get_item_name/3, node_to_path/1,
path_to_node/1]
• Generic functions must call their counterpart
in node_flat
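The delegation pattern can be sketched as follows. This is a Python illustration of the idea rather than an Erlang plugin: NodeFlat here is a tiny stand-in with made-up return values, NodeCustom is a hypothetical plugin, and the option and feature names are only examples.

```python
class NodeFlat:
    """Stand-in for the generic plugin that holds the shared behaviour."""

    def options(self):
        return {"persist_items": True, "max_items": 10, "access_model": "open"}

    def features(self):
        return ["create-nodes", "publish", "subscribe"]

    def publish_item(self, node, publisher, item_id, payload):
        return ("ok", item_id)


class NodeCustom:
    """Custom plugin: overrides what it needs, forwards everything else."""

    def __init__(self, base=None):
        self._base = base if base is not None else NodeFlat()

    def options(self):
        opts = dict(self._base.options())
        opts["persist_items"] = False  # the only behaviour changed here
        return opts

    def features(self):
        return self._base.features()  # generic: forward to the flat plugin

    def publish_item(self, node, publisher, item_id, payload):
        # Generic: forward to the flat plugin unchanged.
        return self._base.publish_item(node, publisher, item_id, payload)
```

Keeping every generic function as a plain forward means a custom plugin stays small and picks up fixes made to the generic one.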
23. Example
• Customize options/0 and features/0 to match your
needs, using any of the features available from the PubSub
engine. (This determines how PubSub controls
calls to plugins)
• Implement create_node_permission, for example to
check an LDAP directory against an access flag
• Write your own tests on publish or create node,
forbid explicit access to items, etc.
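A custom create_node_permission check might be modelled as below. This is a sketch under assumptions: the first six parameter names are guesses at what the six arguments could represent, the extra directory argument is a plain dict standing in for an LDAP lookup, and the boolean return value is a simplification.

```python
def create_node_permission(host, server_host, node, parent_node,
                           owner, access, directory):
    """Allow node creation only if the directory grants `access` to `owner`.

    `directory` maps a bare JID to the set of access flags it holds;
    a real deployment would query LDAP instead (assumption).
    """
    granted = directory.get(owner, frozenset())
    return access in granted
```

With a directory of {"alice@example.org": {"pubsub_createnode"}}, only alice would be allowed to create nodes when the access flag is "pubsub_createnode".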
24. Clustering
• ejabberd's implementation aims to cover the most generic and
standard uses. It works well for common use, but is far from optimal
for edge cases or specific workloads.
• Nodes, affiliations, subscriptions and items are stored in a
replicated database.
• Each ejabberd node has access to all the data.
• Each ejabberd node handles part of the load, but writes to node
records (pubsub_node) still lock the database cluster-wide
• Affiliations, subscriptions and items use non-blocking writes
(pubsub_state and pubsub_item)
25. Optimisations
• Take advantage of clustering depending on your needs:
millions of nodes and few subscribers: split nodes over the cluster using a
hash, with very little replication. If nodes are not configurable, just use
the virtual nodetree
few nodes and lots of subscribers: split subscriptions over the cluster, so
that each ejabberd node only stores and handles its local subscribers, and
multi-call publish_item across the cluster
no subscription options: remove the pubsub_subscriptions calls from
the plugin
high publish rate (real-time notification): remove item persistence and
enable the memory cache of the last item only if needed (pubsub_last_item
table). Keep the cache replicated or local depending on the clustering scheme.
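Splitting nodes over the cluster with a hash can be sketched as a pure function from node name to cluster member. This is one possible scheme, shown as an assumption rather than ejabberd's method: owner_of is a hypothetical helper and the member names are placeholders.

```python
import hashlib


def owner_of(node_name, cluster):
    """Pick the cluster member responsible for a PubSub node.

    Uses a stable hash so every member computes the same answer
    without consulting a shared table.
    """
    digest = hashlib.sha256(node_name.encode("utf-8")).digest()
    return cluster[int.from_bytes(digest[:8], "big") % len(cluster)]
```

Note that simple modulo hashing reassigns most nodes when the cluster size changes; a consistent-hashing scheme would reduce that, at the cost of more code.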
QUESTIONS ?