Introduction to the TIPC Messaging Service

An overview of the main features and properties of the Transparent Inter Process Communication (TIPC) Messaging Service.
Covers service addressing, neighbor discovery, failure detection, service and topology registry and subscriptions, reliable connectionless and connection-oriented transport, flow control, multipath transport, and more.


  1. TIPC: Transparent Inter Process Communication, by Jon Maloy
  2. TIPC FEATURES
     “All-in-one” L2 based messaging service
     - Reliable datagram unicast, anycast and multicast
     - Connections with stream or message transport
     - Location transparent service addressing
     - Multi-binding of addresses
     - Immediate auto-adaptation/failover after network changes
     Service and Topology tracking function
     - Nodes, processes, sockets, addresses, connections
     - Immediate feedback about service availability or topology changes
     - Subscription/event function for service addresses
     - Fully automatic neighbor discovery
  3. TIPC == SIMPLICITY
     No need to configure or look up (IP) addresses
     - Addresses are always valid and can be hard-coded
     - Addresses refer to services, not locations
     - Address space is unique per distributed container/name space
     No need to configure L3 networks
     - But VLANs may be useful...
     No need to supervise processes or nodes
     - No more heart-beating
     - You will learn about changes, if you want to know
     Easy synchronization during start-up (see the sketch below)
     - First, bind to your own service address(es)
     - Second, subscribe for the wanted service addresses
     - Third, start communicating when the service becomes available
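     A minimal sketch of those three steps, using the TIPC C API shown
     on slide 17; the service type 42 and the SOCK_RDM socket type are
     illustrative assumptions, not values mandated by TIPC:

     /* Start-up synchronization sketch against tipcc.h (slide 17).
      * Assumptions: service type 42 is just an example, and SOCK_RDM
      * is a valid sk_type for datagram sockets. */
     #include <sys/socket.h>
     #include "tipcc.h"

     int main(void)
     {
         struct tipc_addr peer = { .type = 42, .instance = 2, .domain = 0 };
         int sd = tipc_socket(SOCK_RDM);

         /* 1. Bind our own service address first */
         if (tipc_bind(sd, 42, 1, 1, 0) < 0)
             return 1;

         /* 2. Wait until the wanted service is bound somewhere (<= 10 s) */
         if (!tipc_srv_wait(&peer, 10000))
             return 1;

         /* 3. Start communicating once the service is available */
         tipc_sendto(sd, "hello", 6, &peer);
         return tipc_close(sd);
     }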
  4. THE DOMAIN CONCEPT
     32-bit domain identifier
     - Assigned to node, cluster and zone
     - Structure <Z.C.N>, where zero in a position means wildcard (~anywhere/anycast)

     typedef uint32_t tipc_domain_t;
     tipc_domain_t tipc_domain(unsigned int zone, unsigned int cluster,
                               unsigned int node);

     [Diagram: nested domains, e.g. node domain <1.1.1> inside cluster
     domain <1.1.0>, inside zone domain <1.0.0>, inside the global
     domain <0.0.0>]
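     For illustration, lookup domains at each scope could be built with
     tipc_domain() like this (the zone/cluster/node values are examples
     only):

     /* Illustrative lookup domains built with tipc_domain() (tipcc.h) */
     #include "tipcc.h"

     void make_domains(void)
     {
         tipc_domain_t node    = tipc_domain(1, 1, 1); /* exactly node <1.1.1>        */
         tipc_domain_t cluster = tipc_domain(1, 1, 0); /* anywhere in cluster <1.1.0> */
         tipc_domain_t zone    = tipc_domain(1, 0, 0); /* anywhere in zone <1.0.0>    */
         tipc_domain_t global  = tipc_domain(0, 0, 0); /* anywhere at all             */
         (void)node; (void)cluster; (void)zone; (void)global; /* silence warnings */
     }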
  5. SERVICE ADDRESSING
     “Well known port number” assigned by the developer
     - 32-bit service type number, typically hard-coded
     - 32-bit service instance, typically calculated
     - 32-bit domain identity
       - Indicates visibility scope on the binding side
       - Indicates lookup scope on the calling side

     struct tipc_addr {
         uint32_t type;
         uint32_t instance;
         tipc_domain_t domain;
     };

     [Diagram: two server processes bind(service = 42, instance = 1,
     domain = 0) and bind(service = 42, instance = 2, domain = 0); a
     client process reaches the latter with sendto(service = 42,
     instance = 2, domain = 0)]
  6. SERVICE BINDING
     Almost no restrictions on how to bind service addresses (see the
     sketch below)
     - Different service addresses can bind to the same socket
     - The same service address can bind to different sockets
     - Ranges of service instances can bind to a socket

     [Diagram: one server binds instance 2 of service 42 plus the range
     0-100 of service 4711 to a single socket; another server binds
     instance 2 and the range from instance 1 of service 42; a client
     reaches them with sendto(service = 42, instance = 2, domain = 0)]
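     A hedged sketch of the bindings pictured above, using tipc_bind()
     from the C API on slide 17; binding a single instance by passing
     it as both lower and upper bound follows from the function's
     signature, and SOCK_RDM as socket type is an assumption:

     /* Bind a single instance and a whole range to one socket */
     #include <sys/socket.h>
     #include "tipcc.h"

     int bind_services(void)
     {
         int sd = tipc_socket(SOCK_RDM);

         /* Single instance 2 of service 42: lower == upper == 2 */
         tipc_bind(sd, 42, 2, 2, 0);

         /* Range 0..100 of a second service type on the same socket */
         tipc_bind(sd, 4711, 0, 100, 0);

         return sd;
     }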
  7. MULTICAST
     All servers bound to the given service address receive a copy (see
     the sketch below)
     - Delivery and sequentiality guaranteed socket-to-socket

     [Diagram: a client does sendto(service = 42, instance = 2,
     domain = 0); every server socket whose binding covers service 42,
     instance 2 receives a copy]
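     In the C API this maps onto tipc_mcast(); a sketch, assuming
     SOCK_RDM as the datagram socket type and reusing the figure's
     example address:

     /* Every socket bound to service 42, instance 2 gets a copy */
     #include <string.h>
     #include <sys/socket.h>
     #include "tipcc.h"

     void mcast_hello(void)
     {
         struct tipc_addr dst = { .type = 42, .instance = 2, .domain = 0 };
         const char *msg = "hello all";
         int sd = tipc_socket(SOCK_RDM);

         tipc_mcast(sd, msg, strlen(msg) + 1, &dst);
         tipc_close(sd);
     }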
  8. LOCATION TRANSPARENCY
     Location of the server is not known by the client
     - Translation from service address to physical destination is
       performed on the fly at the source node
     - Replica of the global binding table on each node
     - Very efficient hash lookup

     [Diagram: servers on nodes <1.1.1> and <1.2.3> bind service 42,
     instances 1 and 2; a client on node <1.1.8> reaches instance 2
     with sendto(service = 42, instance = 2, domain = 0)]
  9. RELIABLE DATAGRAM SERVICE
     Reliable socket to socket
     - Receive buffer overload protection
     - No real flow control; messages may still be rejected
     Rejected messages may be dropped or returned (see the sketch below)
     - Configurable in the sending socket
     - Truncated message returned with an error code
     Multicast is just a special case
     - But multicast messages cannot be made returnable

     [Diagram: two servers bind service 42, instances 1 and 2; a client
     does sendto(service = 42, instance = 2, domain = 0)]
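     The drop-or-return choice can be sketched with
     tipc_sock_rejectable() and the err out-parameter of
     tipc_recvfrom(), both declared in the header on slide 17
     (SOCK_RDM is again an assumption):

     /* Mark the socket 'rejectable' so an undeliverable message comes
      * back, then detect the returned (possibly truncated) copy */
     #include <stdio.h>
     #include <sys/socket.h>
     #include "tipcc.h"

     void send_and_check(const struct tipc_addr *dst)
     {
         char buf[1024];
         struct tipc_addr src;
         int err = 0;
         int sd = tipc_socket(SOCK_RDM);

         tipc_sock_rejectable(sd);
         tipc_sendto(sd, "payload", 8, dst);

         /* Blocks until something arrives; a rejected message shows up
          * with err != 0 */
         if (tipc_recvfrom(sd, buf, sizeof(buf), &src, NULL, &err) >= 0 && err)
             fprintf(stderr, "message rejected, error %d\n", err);
         tipc_close(sd);
     }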
  10. LIGHTWEIGHT CONNECTION
     Established by using a service address (see the sketch below)
     - One-way setup (a.k.a. “0-RTT”) using data-carrying messages
     - Traditional TCP-style setup/shutdown also available
     Stream- or message-oriented
     - End-to-end flow control for buffer overflow protection
     - No sequence numbers, acks or retransmissions; the link layer
       takes care of that
     Breaks immediately if the peer becomes unavailable
     - Leverages link-level heartbeats and kernel/socket cleanup
       functionality
     - No active “keepalive” heartbeats needed

     [Diagram: a server on node <1.1.8> binds service 42, instance 2; a
     client on node <1.1.1> does connect(service = 42, instance = 2,
     domain = 0)]
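     A minimal sketch of both ends of such a connection; SOCK_STREAM as
     the stream-oriented socket type is an assumption:

     /* Connect by service address: no address lookup by the client */
     #include <stdbool.h>
     #include <sys/socket.h>
     #include "tipcc.h"

     void client_side(void)
     {
         struct tipc_addr srv = { .type = 42, .instance = 2, .domain = 0 };
         int sd = tipc_socket(SOCK_STREAM);

         if (tipc_connect(sd, &srv) == 0)
             tipc_send(sd, "hi", 3);
         tipc_close(sd);
     }

     void server_side(void)
     {
         struct tipc_addr cli;
         char buf[64];
         int sd = tipc_socket(SOCK_STREAM);

         tipc_bind(sd, 42, 2, 2, 0);
         tipc_listen(sd, 8);
         int csd = tipc_accept(sd, &cli);   /* cli holds the peer socket id */
         tipc_recv(csd, buf, sizeof(buf), false);
         tipc_close(csd);
         tipc_close(sd);
     }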
  11. LINK
     “L2.5” reliable link layer, node to node
     - Guarantees delivery and sequentiality for all messaging
     - Acts as a “trunk” for multiple connections, and keeps track of
       those
     - Keeps track of the peer node's address bindings in a local
       replica of the binding table
     Supervised by heartbeats at low traffic
     - “Lost service address” events issued for bindings from the peer
       node if no link is left
     - Breaks all connections to the peer node if no link is left
     Several links per node pair
     - Load sharing or active-standby, but maximum two active
     - Loss-free failover to the remaining link, if any

     [Diagram: client processes on node <1.1.1> hold connections over a
     shared link to server processes bound on node <1.1.8>]
  12. NEIGHBOR DISCOVERY
     L2 connectivity determines the network
     - Neighbor discovery by L2 broadcast, qualified by a lookup domain
       identity
     - All qualifying nodes in the same L2 broadcast domain establish
       mutual links
     - One link per interface, maximum two active links per node pair
     - Each node has its own view of its environment

     [Diagram: discovered nodes grouped into clusters <1.1.0>, <1.2.0>
     and <1.3.0>]
  13. SERVICE SUBSCRIPTION
     Users can subscribe for the contents of the global address binding
     table (see the sketch below)
     - An event is received at each change matching the subscription
     There is a match when
     - A bound/unbound instance or range overlaps with the subscribed
       range

     [Diagram: servers on node <1.1.7> bind service 42, instances 1 and
     2; a client on node <1.1.8> does subscribe(service = 42,
     lower = 0, upper = 10)]
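     A sketch of the subscription loop with the topology-server calls
     from slide 17; passing 0 to tipc_topsrv_conn() to reach the own
     node's topology server is an assumption, and expire < 0 means the
     subscription never expires (per the header comments):

     /* Watch instances 0..10 of service 42 and print bind/unbind events */
     #include <stdbool.h>
     #include <stdio.h>
     #include "tipcc.h"

     void watch_service(void)
     {
         struct tipc_addr srv;
         bool up, expired;
         int sd = tipc_topsrv_conn(0);

         tipc_srv_subscr(sd, 42, 0, 10, true, -1);
         while (tipc_srv_evt(sd, &srv, &up, &expired) == 0 && !expired)
             printf("service %u:%u is %s\n", srv.type, srv.instance,
                    up ? "up" : "down");
         tipc_close(sd);
     }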
  14. TOPOLOGY SUBSCRIPTION
     Special case of service subscription (see the sketch below)
     - Uses the same mechanism, based on service table contents
     - Represented by the built-in service type zero (0 ~ “node
       availability”)

     [Diagram: a client on node <1.1.8> tracks node <1.1.7> with
     subscribe(service = 0, lower = 0, upper = ~0)]
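     The C API also wraps this type-zero subscription in dedicated
     neighbor calls; a sketch, again assuming that 0 addresses the own
     node's topology server:

     /* Track node availability via the neighbor subscription calls */
     #include <stdbool.h>
     #include <stdio.h>
     #include "tipcc.h"

     void watch_nodes(void)
     {
         tipc_domain_t node;
         bool up;
         int sd = tipc_neigh_subscr(0);

         while (tipc_neigh_evt(sd, &node, &up) == 0)
             printf("node %u is %s\n", node, up ? "up" : "down");
         tipc_close(sd);
     }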
  15. ARCHITECTURE
     [Diagram: layered architecture]
     - User land: C library, Python, etc., on top of the socket API
     - L4: sockets; connection handling, flow control
     - L3: binding table and topology service; destination lookup
     - L2/internal: links and node table; fragmentation/bundling/
       retransmission/congestion control, link aggregation/
       synchronization/failover, neighbor discovery/supervision
     - L2/external: carrier media plugins; Ethernet, Infiniband, UDP,
       VxLAN
  16. API
     Socket API
     - The original TIPC API
     - Not shown here
     TIPC C API
     - Simpler and more intuitive
     Python, Perl, Ruby, D
     - But not yet for Java
     ZeroMQ
     - Not yet with full features
  17. TIPC C API

     /* Addressing:
      * - struct tipc_addr may contain <type != 0, instance, domain> or
      *   <type == 0, instance == port, domain == node>
      * - tipc_domain_t has internal structure <zone.cluster.node>. Valid
      *   contents <z.c.n>, <z.c.0>, <z.0.0> or <0.0.0>, 0 is wildcard
      */
     typedef uint32_t tipc_domain_t;

     struct tipc_addr {
         uint32_t type;
         uint32_t instance;
         tipc_domain_t domain;
     };

     tipc_domain_t tipc_own_node(void);
     tipc_domain_t tipc_own_cluster(void);
     tipc_domain_t tipc_own_zone(void);
     tipc_domain_t tipc_domain(unsigned int zone, unsigned int cluster,
                               unsigned int node);

     /* Socket:
      * - 'Rejectable': sent messages will return if rejected at dest
      */
     int tipc_socket(int sk_type);
     int tipc_sock_non_block(int sd);
     int tipc_sock_rejectable(int sd);
     int tipc_close(int sd);
     int tipc_sockid(int sd, struct tipc_addr *id);
     int tipc_bind(int sd, uint32_t type, uint32_t lower, uint32_t upper,
                   tipc_domain_t scope);
     int tipc_connect(int sd, const struct tipc_addr *dst);
     int tipc_listen(int sd, int backlog);
     int tipc_accept(int sd, struct tipc_addr *src);

     /* Messaging:
      * - NULL pointer parameters are always accepted
      * - tipc_sendto() to an accepting socket initiates two-way connect
      * - If no err pointer given, tipc_recvfrom() returns 0 on rejected
      *   message
      * - If (*err != 0) buffer contains a potentially truncated rejected
      *   message
      */
     int tipc_recvfrom(int sd, char *buf, size_t len, struct tipc_addr *src,
                       struct tipc_addr *dst, int *err);
     int tipc_recv(int sd, char *buf, size_t len, bool waitall);
     int tipc_send(int sd, const char *msg, size_t len);
     int tipc_sendmsg(int sd, const struct msghdr *msg);
     int tipc_sendto(int sd, const char *msg, size_t len,
                     const struct tipc_addr *dst);
     int tipc_mcast(int sd, const char *msg, size_t len,
                    const struct tipc_addr *dst);

     /* Topology Server:
      * - Expiration time in [ms]
      * - If (expire < 0) subscription never expires
      */
     int tipc_topsrv_conn(tipc_domain_t topsrv_node);
     int tipc_srv_subscr(int sd, uint32_t type, uint32_t lower,
                         uint32_t upper, bool all, int expire);
     int tipc_srv_evt(int sd, struct tipc_addr *srv, bool *available,
                      bool *expired);
     bool tipc_srv_wait(const struct tipc_addr *srv, int expire);
     int tipc_neigh_subscr(tipc_domain_t topsrv_node);
     int tipc_neigh_evt(int sd, tipc_domain_t *neigh_node, bool *available);
     int tipc_link_subscr(tipc_domain_t topsrv_node);
     int tipc_link_evt(int sd, tipc_domain_t *neigh_node, bool *available,
                       int *local_bearerid, int *remote_bearerid);

     http://sourceforge.net/p/tipc/tipcutils/ci/master/tree/demos/c_api_demo/tipcc.h
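     To tie the header together, a hedged end-to-end sketch: bind a
     service, print the socket's own id, and echo one datagram back to
     its sender. SOCK_RDM is an assumption; per the comments above, a
     tipc_addr with type == 0 carries <port, node>, so the received
     source address can be used directly as a reply address:

     /* Echo server sketch against the header above */
     #include <stdio.h>
     #include <sys/socket.h>
     #include "tipcc.h"

     int main(void)
     {
         struct tipc_addr self, src;
         char buf[128];
         int sd = tipc_socket(SOCK_RDM);

         tipc_bind(sd, 42, 1, 1, 0);
         tipc_sockid(sd, &self);
         printf("bound as port %u on node %u\n", self.instance, self.domain);

         int n = tipc_recvfrom(sd, buf, sizeof(buf), &src, NULL, NULL);
         if (n > 0)
             tipc_sendto(sd, buf, n, &src);  /* echo back to sender */
         return tipc_close(sd);
     }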
  18. PERFORMANCE
     Latency times better than on TCP
     - ~33% faster than TCP inter-node
     - 2x (64-byte messages) to 7x (64 kB messages) faster intra-node
       messaging
     - TIPC doesn't use the loopback interface
     Throughput still somewhat lower than TCP
     - ~65-90% of max TCP throughput inter-node
     - Seems to be environment dependent
     - But 25-30% better than TCP intra-node
  19. WHEN TO USE TIPC
     TIPC does not replace IP-based transport protocols
     - It is a complement to be used under certain conditions
     - It is an IPC!
     TIPC may be a good option if you
     - Want start-up synchronization for free
     - Have application components that need to keep continuous watch
       on each other
     - Need short latency times
     - Have traffic that is heavily intra-node
     - Don't want to bother with configuration
     - Need no more than one L2 hop between your components
     - Are inside a security perimeter
  20. WHAT TIPC WILL NOT DO FOR YOU
     No user-to-user acknowledgment of messages
     - Only socket-to-socket delivery is guaranteed
     - What if the user doesn't process the message?
     - On the other hand, which protocol does?
     No datagram transmission flow control
     - For unicast, anycast and multicast
     - Must currently be solved by the user
     - We are working on the problem...
     No routing
     - Only nodes on the same L2 network can communicate
     - But a node may attach to several L2 networks
  21. WHO IS USING TIPC?
     Ericsson mobile and fixed core network systems
     - IMS, PGW, SGW, HSS...
     - Routers/switches such as SSR, AXE
     - Hundreds of installed sites
     - Tens of thousands of nodes
     - Tens of millions of subscribers
     WindRiver
     - Mission-critical system for Sikorsky Aircraft's helicopters
     Cisco
     - onePK, IOS-XE Software, NX-OS Software
     Huawei
     - OpenStack
     Mirantis
     - OpenStack
     Numerous other companies and institutions
  22. MORE INFORMATION
     TIPC project page
     http://tipc.sourceforge.net/
     TIPC protocol specification
     http://tipc.sourceforge.net/doc/draft-spec-tipc-10.html
     TIPC programmer's guide
     http://tipc.sourceforge.net/doc/tipc_2.0_prog_guide.html
     TIPC C API
     http://sourceforge.net/p/tipc/tipcutils/ci/master/tree/demos/c_api_demo/tipcc.h
