Presence De Facto Usage
• one presence for one user
• we don’t need priority
• ‘chat’ and ‘away’ are enough for the ‘show’ param
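A hedged illustration of that subset; the stanza shapes follow standard XMPP presence and the variable names are ours, not Ocean’s actual code:

  # one presence per user, no <priority/> element, only 'chat' or 'away' for <show/>
  my $presence_chat = q{<presence><show>chat</show></presence>};
  my $presence_away = q{<presence><show>away</show></presence>};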
Resource Binding
• node@domain/resource
• you can choose ‘home’ or ‘work’
• de facto usage
• users and client software don’t care what the resource is
• so the server generates a random string (like a cookie)
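A minimal sketch of that idea; the function name and token length are illustrative, not Ocean’s actual code:

  use strict;
  use warnings;

  # the resource is just an opaque random token, like a session cookie
  sub generate_resource {
      my @chars = ('a' .. 'z', 'A' .. 'Z', '0' .. '9');
      return join '', map { $chars[ int rand @chars ] } 1 .. 16;
  }

  my $full_jid = 'node@domain/' . generate_resource();
  print "$full_jid\n";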
S2S Problems
• We have to manage a new database that represents the relationships between users of our service and users of other services.
• We have to manage a whitelist or spam control (like OpenID).
• Authentication Compatibility Cost
• Scaling Cost
• Error Handling
Why subset?
• Ocean is just for a web-based social networking service
• friendships are created/modified on the web site
• S2S costs too much
• developers only have to write a minimum number of event handlers
Protocol Customized
• resource binding & roster/vCard responses support extra attributes (see the sketch below)
• since we disregard federation, we can add non-standard data, especially URLs
• WebSocket binding (experimental)
• try demo application
• Simple-MUC
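A hedged sketch of a roster item carrying such extra attributes; the attribute names and URLs are hypothetical, not Ocean’s actual schema:

  my $roster_item = <<'XML';
  <item jid="friend@example.com" name="Friend"
        avatar_url="http://img.example.com/avatar/friend.png"
        profile_url="http://example.com/show_friend?id=123"/>
  XML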
Not flexible and pluggable?
• It may cause complications, and requires detailed knowledge of both the protocol and the framework.
• When the servers are distributed, front-server plugins require a complicated spec. See the ‘Cluster’ section.
• There are de-facto-standard specs.
Why full scratch?
• Proprietary Reason
• We may need non-standard features
• Rapid updating (community-based development requires a lot of discussion)
• Architectural Dependencies
• No Money
Pay attention to memory usage
• 4 bytes on a Stream object
• is 4 × 10,000 bytes on a server with 10,000 connections
• bless Array instead of Hash
• use Config instance as singleton object
• use delegate object for event handling
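A hedged sketch of the array-based object idiom; the package name and field layout are illustrative, not Ocean’s actual code:

  package My::Stream;
  use strict;
  use warnings;

  # field positions are constants instead of hash keys, which saves
  # per-object memory when a server holds tens of thousands of streams
  use constant { FD => 0, JID => 1, STATE => 2 };

  sub new {
      my ($class, %args) = @_;
      return bless [ $args{fd}, undef, 'init' ], $class;   # bless an ARRAY ref, not a HASH ref
  }

  sub fd    { $_[0]->[FD]    }
  sub state { $_[0]->[STATE] }

  1;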
Handler uses 2 or 3 kinds of store
• Persistent data store (users, relationships)
• Semi-persistent data store: same lifecycle as the XMPP service (connection map)
• Cache system (if needed)
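A minimal sketch of how a handler might hold these stores; the constructor fields are assumptions, not Ocean’s actual interface:

  package MyService::Handler;
  use strict;
  use warnings;

  sub new {
      my ($class, %args) = @_;
      return bless {
          db             => $args{db},              # persistent: users, relationships
          connection_map => $args{connection_map},  # lives as long as the XMPP service
          cache          => $args{cache},           # optional cache system
      }, $class;
  }

  1;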
Presence doesn’t scale
• user_001 logs in (sends initial presence)
• get all friends of user_001
• get all available resources of those friends
• send user_001’s presence to all the available resources
• send the presences of all the available resources to user_001
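A hedged sketch of that fan-out (the data and the delivery callback are passed in; names are illustrative). With N friends and M available resources each, one login triggers roughly 2 × N × M deliveries:

  use strict;
  use warnings;

  sub broadcast_initial_presence {
      my ($user_id, $friends_of, $resources_of, $deliver) = @_;
      for my $friend (@{ $friends_of->{$user_id} || [] }) {
          for my $resource (@{ $resources_of->{$friend} || [] }) {
              $deliver->($resource, "presence of $user_id");   # my presence -> friend's resource
              $deliver->($user_id,  "presence of $resource");  # friend's presence -> me
          }
      }
  }

  broadcast_initial_presence(
      'user_001',
      { user_001 => [ 'user_002', 'user_003' ] },
      { user_002 => [ 'user_002/home' ], user_003 => [ 'user_003/work' ] },
      sub { my ($to, $what) = @_; print "send $what to $to\n" },
  );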
It’s important to limit the maximum number of friends.
Cluster
• C10K Problem on Persistent Connection
• Blocking IO multiplexing
• Message delivery/Presence broadcast
Message Delivery
[diagram] The Connection Map records which front-end server holds each resource: User A/Home → 1, User A/Work → 2, User B/Home → 1, User C/Home → 2. The Delivery Server uses it to route stanzas to XMPP Server 1 (User A Home, User B Home) and XMPP Server 2 (User A Work, User C Home).
Use 2 kinds of gearman
• Background Job Dispatcher
• Inbox (1 inbox for each front-end server)
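A hedged sketch, assuming the Perl Gearman::Client module; the job-server addresses and task names are illustrative, not Ocean’s actual names:

  use strict;
  use warnings;
  use Gearman::Client;

  my $stanza = '<message to="user_002@example.com">...</message>';

  # Background Job Dispatcher: a front-end server enqueues work for the workers
  my $dispatcher = Gearman::Client->new;
  $dispatcher->job_servers('dispatcher.example:4730');
  $dispatcher->dispatch_background('deliver_message', \$stanza);

  # Inbox: a worker pushes the routed stanza to the gearmand owned by the
  # front-end server that holds the destination connection
  my $inbox = Gearman::Client->new;
  $inbox->job_servers('xmpp1.example:4730');
  $inbox->dispatch_background('inbox1', \$stanza);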
Message Delivery
[diagram] Same Connection Map (User A/Home → 1, User A/Work → 2, User B/Home → 1, User C/Home → 2), now with gearman in the path: front-end servers enqueue jobs into the Background Job Dispatcher (gearmand), the Delivery Server processes them, and results are pushed to the per-front-server inboxes inbox1 (gearmand, XMPP Server 1) and inbox2 (gearmand, XMPP Server 2).
Connection Map
• too many ‘read’s
• ‘write’s on resource binding and presence change
• Implementation
• TokyoTyrant
• HandlerSocket
• must not be volatile
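A hedged sketch of the connection-map interface; the key layout is an assumption, and in production the backend would be TokyoTyrant or HandlerSocket rather than an in-process hash:

  use strict;
  use warnings;

  my %connection_map;

  # 'write': on resource binding and presence change
  sub bind_resource {
      my ($bare_jid, $resource, $node) = @_;
      $connection_map{"$bare_jid/$resource"} = $node;
  }

  # 'read': on every delivery, so reads vastly outnumber writes
  sub nodes_for {
      my ($bare_jid) = @_;
      return map  { $connection_map{$_} }
             grep { index($_, "$bare_jid/") == 0 } keys %connection_map;
  }

  bind_resource('user_a@example.com', 'home', 'xmpp1');
  bind_resource('user_a@example.com', 'work', 'xmpp2');
  print join(',', nodes_for('user_a@example.com')), "\n";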
Worker side (delivery server) components
[diagram] A Process Manager supervises the Workers. Each Worker pulls jobs from the Background Job Dispatcher (gearmand) via a Subscriber, runs the Handler against the DB (users, relationships, etc...) and the Connection Map (User A/Home → 1, User A/Work → 2, User B/Home → 1, User C/Home → 2), serializes the result with the Serializer, and the Deliverer pushes it to the per-front-server inboxes inbox0 / inbox1 (gearmand).
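A hedged sketch of a worker loop, assuming the Perl Gearman::Worker module; the task and server names are illustrative:

  use strict;
  use warnings;
  use Gearman::Worker;

  my $worker = Gearman::Worker->new;
  $worker->job_servers('dispatcher.example:4730');

  $worker->register_function('deliver_message' => sub {
      my $job    = shift;
      my $stanza = $job->arg;   # serialized job enqueued by a front-end server

      # here the Handler would consult the DB and the connection map,
      # the Serializer would build the outgoing stanza, and the Deliverer
      # would push it to the inbox gearmand of the right front-end server
      warn "would deliver: $stanza\n";
      return 1;
  });

  $worker->work while 1;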
Update & Reboot
• The worker-side handler includes service code, so it needs to be updated and rebooted frequently. Worker jobs are stateless, so this is no problem.
• The front-end server doesn’t include service code, so you don’t need to reboot it frequently; but when you do, you have to pay attention.
Shutdown
• You can’t shut down a server without the client noticing.
• You have to write reconnection logic on the client side.
• Problem: the algorithm-based distribution case
• At graceful shutdown, unavailable presences are handled correctly
• Problem: non-graceful shutdown
Methods
• Polling (Ajax) + Post API (+ APNS | C2DM)
• Long Polling (Ajax, Comet) + Post API
• Server-Sent Events + Post API
• WebSocket
• TCP XMPP
‘Stateless’ problems
• connection management differs from ordinary persistent connections
• a single user session may open many tabs
• users move between pages quickly, causing too many reconnections
• how do we detect that a user has left? (left, or just moved?)
[diagram] Users A and B open many tabs or hop HTTP links in a short time; mod_proxy(_balancer) spreads the tabs across server 1, server 2 and server 3, and each server writes to the Connection Map (User A/Home → 1, User A/Work → 2, User B/Home → 1, User C/Home → 2). The result is too many ‘write’s and presence changes in a short time, and no clear way to tell when a user leaves.
Solution 1
Algorithm Based Distribution
• By fixing the destination server per user, packets can be delivered to a specific user without using the Connection Map
• Whether or not the user currently has the web service open no longer matters
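A hedged sketch of server_node = algorithm(user_id); the node list and the choice of hashing are assumptions:

  use strict;
  use warnings;
  use Digest::MD5 qw(md5);

  my @nodes = ('hb1.mixi.jp', 'hb2.mixi.jp', 'hb3.mixi.jp');

  sub node_for_user {
      my ($user_id) = @_;
      my $n = unpack 'N', md5($user_id);   # stable 32-bit value per user
      return $nodes[ $n % @nodes ];
  }

  # every tab of the same user always lands on the same node,
  # so no Connection Map lookup is needed
  print node_for_user('user_001'), "\n";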
[diagram] Algorithm-based distribution: server_node = algorithm(user_id). Each user’s tabs (User A, User B, User C) always reach the same server (hb1.mixi.jp, hb2.mixi.jp, hb3.mixi.jp); a Manager on each server aggregates the connections as a single session, uses a timer to detect users leaving, and buffers messages received until the timer expires. The Connection Map (User A/Home → 1, User A/Work → 2, User B/Home → 1, User C/Home → 2) is no longer consulted for routing.
(*) The number of simultaneous Ajax requests directed to the same domain is limited, so we need many hostname aliases for each hb server (if we adopt long polling).
Handle Multiple Streams as One Single Session
[diagram] Each client stream goes through Handshake, Auth, ResourceBinding, SessionEstablishment and InitialPresence in the Stream Manager; the server then groups the streams by user (e.g. two User A streams, two User B streams) into a single Session, and the Handler only ever deals with that session.
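A hedged sketch of grouping streams into one session per user; the class and method names are illustrative, not Ocean’s actual code:

  package StreamManager;
  use strict;
  use warnings;

  sub new { bless { sessions => {} }, shift }

  # called after handshake, auth, resource binding and session establishment
  sub add_stream {
      my ($self, $bare_jid, $stream) = @_;
      my $session = $self->{sessions}{$bare_jid} ||= { streams => [] };
      push @{ $session->{streams} }, $stream;
      return $session;   # the Handler only ever sees this one session
  }

  sub remove_stream {
      my ($self, $bare_jid, $stream) = @_;
      my $session = $self->{sessions}{$bare_jid} or return;
      $session->{streams} = [ grep { $_ != $stream } @{ $session->{streams} } ];
      delete $self->{sessions}{$bare_jid} unless @{ $session->{streams} };
  }

  1;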
Tasks
• We need to
• prepare a social ad system
• improve the UI (moveless)
PubSub Support
• XEP-0060 Publish-Subscribe
• We don’t need full-spec support
• users are not allowed to publish/subscribe
• only the server publishes events
Publish Event from Web Service
[diagram] The Web Service publishes UserEvents (Notification, Activity) to the Delivery Server, which pushes them to XMPP Server 1 (User A Home, User B Home) and XMPP Server 2 (User A Work, User C Home).
Overview
• XMPP has a MUC extension spec, but it’s over-spec for us, so we need to adapt it
• It costs too much, like presence does
• limit the number of channels and participants
• Static-type group chat should stay synchronized with the ‘group (room?)’ feature on our service
Hole Punching
1. Local host connects to an external host
2. The NAT allocates a transport address to the local host
[diagram] Local Host 192.168.0.1:10001 → NAT mapping → 10.100.100.100:10002
Hole Punching
3. An external host can then send packets to the local host through the mapped global address
But not every NAT works as expected
Find Friends
• search for services via DNS PTR records
• search for the host of the indicated service via SRV records
• resolve the IP address via A records
• (TXT records for capability and status)
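A hedged sketch of that lookup chain, assuming the Perl Net::DNS module; the record names are illustrative:

  use strict;
  use warnings;
  use Net::DNS;

  my $res = Net::DNS::Resolver->new;

  # PTR: enumerate services advertised under a name
  if (my $ptr = $res->query('_xmpp-client._tcp.example.com', 'PTR')) {
      for my $rr (grep { $_->type eq 'PTR' } $ptr->answer) {
          print 'service: ', $rr->ptrdname, "\n";
      }
  }

  # SRV: find the host and port for the indicated service;
  # an A-record lookup on $rr->target then gives the IP address
  if (my $srv = $res->query('_xmpp-client._tcp.example.com', 'SRV')) {
      for my $rr (grep { $_->type eq 'SRV' } $srv->answer) {
          printf "host: %s port: %d\n", $rr->target, $rr->port;
      }
  }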