• Data problems in communications
• Different computers store primitive values in different byte orders (big-endian, little-endian)
• Different operating systems use different character code sets (UTF-8, CP949, etc.)
• One of two methods can be used to enable any two computers to exchange binary data values:
• The values are converted to an agreed external format before transmission and converted to the local
form on receipt
• The values are transmitted in the sender’s format, together with an indication of the format used, and
the recipient converts the values if necessary.
• An agreed standard for the representation of data structures and primitive values is
called an external data representation.
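The two methods above can be sketched in Java for a single 32-bit integer. This is a minimal illustration, not a real protocol: the class name, the method names and the one-byte flag values (0 for big-endian, 1 for little-endian) are assumptions made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class ByteOrderDemo {
    // Method 1: always convert to an agreed external format (big-endian here).
    static byte[] toExternal(int value) {
        return ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN).putInt(value).array();
    }

    static int fromExternal(byte[] bytes) {
        return ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN).getInt();
    }

    // Method 2: send in the sender's own byte order, prefixed by a one-byte
    // indication of the format (flag values are an assumption of this sketch).
    static byte[] toTagged(int value, ByteOrder senderOrder) {
        ByteBuffer buf = ByteBuffer.allocate(5);
        buf.put((byte) (senderOrder == ByteOrder.BIG_ENDIAN ? 0 : 1));
        buf.order(senderOrder).putInt(value);
        return buf.array();
    }

    // The recipient reads the flag and converts only if necessary.
    static int fromTagged(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        ByteOrder order = buf.get() == 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        return buf.order(order).getInt();
    }

    public static void main(String[] args) {
        System.out.println(fromExternal(toExternal(1984)));
        System.out.println(fromTagged(toTagged(1984, ByteOrder.LITTLE_ENDIAN)));
    }
}
```

Both calls in main recover the original value, even though the bytes on the wire are laid out differently in the two cases.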
• Marshalling is the process of taking a collection of data items and assembling them
into a form suitable for transmission in a message.
• Three alternative approaches to external data representation and marshalling
• CORBA’s common data representation (CDR)
• Java’s object serialization
• XML (Extensible Markup Language)
• Marshalling should be handled in software, not directly by the programmer.
• Because marshalling takes into account all the details of a composite object, it is prone to errors if
done manually.
• Design issues in marshalling: file format, type information
• CORBA CDR is the external data representation defined
with CORBA 2.0 [OMG 2004a].
• It can represent both primitive types and constructed types.
Type          Representation
sequence      length (unsigned long) followed by elements in order
string        length (unsigned long) followed by characters in order (can also
              have wide characters)
array         array elements in order (no length specified because it is fixed)
struct        in the order of declaration of the components
enumerated    unsigned long (the values are specified by the order declared)
union         type tag followed by the selected member
The flattened form represents a Person struct with value: {‘Smith’, ‘London’, 1984}
index in sequence of bytes    4 bytes    notes on representation
0–3                           5          length of string
4–7                           "Smit"     ‘Smith’
8–11                          "h___"
12–15                         6          length of string
16–19                         "Lond"     ‘London’
20–23                         "on__"
24–27                         1984       unsigned long
• The type of a data item is not given in the message: it is assumed that the sender and
recipient have common knowledge of the order and types of the data items
• Types of data structures and types of basic data items are described in CORBA IDL
• Provides a notation for describing the types of arguments and results of RMI methods
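A minimal Java sketch of marshalling the Person struct into the 28-byte layout of the flattened form. It follows the simplified layout of the figure (4-byte length, characters padded to a 4-byte boundary, big-endian integers) and is not claimed to be wire-compatible with a real ORB. The class and method names are hypothetical. No type information is written, so unmarshal works only because it knows the order and types in advance.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class CdrPersonSketch {
    // Round a length up to the next multiple of 4.
    static int padded(int n) {
        return (n + 3) / 4 * 4;
    }

    // string: length (unsigned long) followed by characters, then padding.
    static void putString(ByteBuffer buf, byte[] chars) {
        buf.putInt(chars.length);
        buf.put(chars);
        for (int i = chars.length; i < padded(chars.length); i++) {
            buf.put((byte) 0);
        }
    }

    static String getString(ByteBuffer buf) {
        int length = buf.getInt();
        byte[] chars = new byte[length];
        buf.get(chars);
        buf.position(buf.position() + padded(length) - length); // skip padding
        return new String(chars, StandardCharsets.US_ASCII);
    }

    // struct: components in the order of declaration (name, place, year).
    static byte[] marshal(String name, String place, int year) {
        byte[] n = name.getBytes(StandardCharsets.US_ASCII);
        byte[] p = place.getBytes(StandardCharsets.US_ASCII);
        int size = 4 + padded(n.length) + 4 + padded(p.length) + 4;
        ByteBuffer buf = ByteBuffer.allocate(size).order(ByteOrder.BIG_ENDIAN);
        putString(buf, n);
        putString(buf, p);
        buf.putInt(year);
        return buf.array();
    }

    // The recipient reads the fields back in the same agreed order.
    static String unmarshal(byte[] message) {
        ByteBuffer buf = ByteBuffer.wrap(message).order(ByteOrder.BIG_ENDIAN);
        String name = getString(buf);
        String place = getString(buf);
        int year = buf.getInt();
        return name + ", " + place + ", " + year;
    }

    public static void main(String[] args) {
        byte[] message = marshal("Smith", "London", 1984);
        System.out.println(message.length + " bytes: " + unmarshal(message));
    }
}
```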
• Both objects and primitive data values may be passed as arguments and results of
method invocations.
• The following Java class is equivalent to the Person struct
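The class itself did not survive in these notes. A minimal sketch consistent with the Person struct {name, place, year} and with the constructor call used later in the notes could look like this. The accessor names and the serialVersionUID value are assumptions.

```java
import java.io.Serializable;

// Implementing Serializable marks instances as eligible for Java serialization.
class Person implements Serializable {
    private static final long serialVersionUID = 1L; // assumed version number

    private final String name;
    private final String place;
    private final int year;

    Person(String aName, String aPlace, int aYear) {
        name = aName;
        place = aPlace;
        year = aYear;
    }

    String getName() { return name; }
    String getPlace() { return place; }
    int getYear() { return year; }
}
```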
• Serialization: flattening objects into a serial form for storing on disk or transmitting in
a message.
• The process that deserializes is assumed to have no prior knowledge of the types of the objects in the serialized form
• Some information about the class of each object is therefore included in the serialized form
• Java objects can contain references to other objects.
• When an object is serialized, all objects it references are serialized with it
• References are serialized as handles
• A handle is a reference to an object within the serialized form
• Each object is written once only
• A handle is written in subsequent occurrences
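The effect of handles can be observed with the standard ObjectOutputStream and ObjectInputStream classes. In this sketch (class and helper names are hypothetical) the same Person is referenced twice from an array. After a round trip both array slots refer to one deserialized object, because the second occurrence was written as a handle rather than as a second copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SharedReferenceDemo {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final String place;
        final int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Flatten an object, and everything it references, into a byte array.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuild the object graph from the serialized form.
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person("Smith", "London", 1984);
        Person[] copy = (Person[]) deserialize(serialize(new Person[] { p, p }));
        System.out.println("shared after round trip: " + (copy[0] == copy[1]));
    }
}
```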
To serialize an object:
1. its class information is written out: name, version number
2. the types and names of its instance variables are written out
* If an instance variable belongs to a new class, then the information for that class must also be
written out, recursively.
* Each class is given a handle
3. the values of its instance variables are written out
Example: Person p = new Person("Smith", "London", 1984);
The true serialized form contains additional type markers; h0 and h1 are handles
Serialized values                                                        Explanation
Person   8-byte version number    h0                                     class name, version number
3        int year    java.lang.String name:    java.lang.String place:   number, type and name of instance variables
1984     5 Smith     6 London     h1                                     values of instance variables
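The class information in the first two rows of the figure can be inspected with the standard java.io.ObjectStreamClass API. The sketch below is an illustration only: the wrapper class, the describe helper and the serialVersionUID value are assumptions. It prints the class name, the version number, and the number, type and name of the instance variables.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;

class ClassInfoDemo {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L; // assumed version number
        private String name;
        private String place;
        private int year;

        Person(String name, String place, int year) {
            this.name = name;
            this.place = place;
            this.year = year;
        }
    }

    // Describe the class information that serialization records for a type.
    static String describe(Class<?> type) {
        ObjectStreamClass info = ObjectStreamClass.lookup(type);
        if (info == null) {
            throw new IllegalArgumentException(type + " is not serializable");
        }
        StringBuilder text = new StringBuilder();
        text.append(info.getName()).append(", version ")
            .append(info.getSerialVersionUID()).append('\n');
        ObjectStreamField[] fields = info.getFields();
        text.append(fields.length).append(" instance variables\n");
        for (ObjectStreamField field : fields) {
            text.append(field.getType().getName()).append(' ')
                .append(field.getName()).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(Person.class));
    }
}
```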
• XML is a markup language that was defined by the World Wide Web Consortium
(W3C) for general use on the Web.
• XML data items are tagged with ‘markup’ strings.
• XML is used to enable clients to communicate with web services and for defining the interfaces and
other properties of web services.
• XML is extensible in the sense that users can define their own tags
• XML documents, being textual, can be read by humans.
• XML elements and attributes
• Elements: An element in XML consists of a portion of character data surrounded by matching start and
end tags.
• Attributes: A start tag may optionally include pairs of associated attribute names and values such as
id="123456789", as shown above.
• Parsing and well-formed documents
• Rules for a well-formed document
• every start tag has a matching end tag
• all tags are correctly nested
• every XML document must have a single root element
• CDATA: a way to represent special characters without them being parsed as markup
• XML prolog: specifies the version and the encoding
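The rules above can be checked with the JDK's built-in DOM parser. In this sketch the person document, the note element and the helper names are assumptions made for illustration. The document has a prolog, an id attribute, nested elements and a CDATA section. The parse helper returns null when the input is not well formed.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedDemo {
    // Hypothetical example document: prolog, attribute, elements, CDATA.
    static final String PERSON =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<person id=\"123456789\">"
        + "<name>Smith</name><place>London</place><year>1984</year>"
        + "<note><![CDATA[1984 < 2024 & counting]]></note>"
        + "</person>";

    // Returns the parsed document, or null when the text is not well formed.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            return null;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(PERSON);
        System.out.println(doc.getDocumentElement().getAttribute("id"));
        System.out.println(doc.getElementsByTagName("note").item(0).getTextContent());
        System.out.println(parse("<person><name>Smith</person></name>") == null);
    }
}
```

The CDATA section lets the note carry the characters < and & as plain text, which would otherwise have to be escaped.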
• XML namespaces : a set of names for a collection of element types and attributes
• referenced by a URL
• allows an application to make use of multiple sets of external definitions in different namespaces
without the risk of name clashes.
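A small sketch of how namespaces avoid name clashes, using the JDK's namespace-aware DOM parser. The two namespace URLs, the record document and the value "Acme" are hypothetical. Both elements are called name, but each belongs to a different namespace and can be looked up unambiguously.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class NamespaceDemo {
    // Hypothetical namespace URLs, used only as unique identifiers.
    static final String PERSON_NS = "http://example.org/person";
    static final String COMPANY_NS = "http://example.org/company";

    static final String XML =
        "<record xmlns:p=\"" + PERSON_NS + "\" xmlns:c=\"" + COMPANY_NS + "\">"
        + "<p:name>Smith</p:name><c:name>Acme</c:name>"
        + "</record>";

    // Returns the text of the first "name" element in the given namespace.
    static String nameIn(String namespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // off by default in the JDK parser
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            return doc.getElementsByTagNameNS(namespace, "name").item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameIn(PERSON_NS));
        System.out.println(nameIn(COMPANY_NS));
    }
}
```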
• XML schemas: define the structure of a document
• defines the elements and attributes that can appear in a document
• defines how the elements are nested, the order and number of elements, and whether an element is
empty or can include text
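A sketch of schema validation with the JDK's javax.xml.validation API. The schema below is an assumption written for this example: a person element containing name, place and year in that order, plus an id attribute. A document can be well formed and still fail validation, for instance when the elements appear in the wrong order.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class SchemaDemo {
    // Hypothetical schema: fixes the nesting, order and types of the elements.
    static final String XSD =
        "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xsd:element name=\"person\"><xsd:complexType>"
        + "<xsd:sequence>"
        + "<xsd:element name=\"name\" type=\"xsd:string\"/>"
        + "<xsd:element name=\"place\" type=\"xsd:string\"/>"
        + "<xsd:element name=\"year\" type=\"xsd:positiveInteger\"/>"
        + "</xsd:sequence>"
        + "<xsd:attribute name=\"id\" type=\"xsd:positiveInteger\"/>"
        + "</xsd:complexType></xsd:element>"
        + "</xsd:schema>";

    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema itself is invalid", e);
        }
    }

    // True when the document conforms to the schema above.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation error or not well formed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<person id=\"123456789\"><name>Smith</name>"
            + "<place>London</place><year>1984</year></person>"));
        System.out.println(isValid("<person><place>London</place><name>Smith</name>"
            + "<year>1984</year></person>"));
    }
}
```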
• Each process contains objects, some of which can receive remote invocations, others
only local invocations.
• Those that can receive remote invocations are called remote objects
• Objects need to know the remote object reference of an object in another process in
order to invoke its methods.
• A remote object reference is an identifier for a remote object that is valid throughout
a distributed system.
• Remote object references must be generated in a manner that ensures uniqueness
over space and time.
• Even if a remote object is deleted, it is important that its remote object reference is not reused
• Example of a unique remote object reference
• Concatenate the Internet address of its computer and the port number of the process that created it with
the time of its creation and a local object number
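The concatenation described above can be sketched as a small value class. The 32-bit width of each field, the use of an IPv4 address and the class and method names are assumptions of this sketch. The local object number comes from a counter that is never reset within the process, so a reference is not reused after its object is deleted.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

final class RemoteObjectRef {
    // Incremented for every remote object created in this process.
    private static final AtomicInteger NEXT_OBJECT_NUMBER = new AtomicInteger();

    private final byte[] bytes;

    private RemoteObjectRef(byte[] bytes) {
        this.bytes = bytes;
    }

    // Concatenate address, port, creation time and local object number.
    static RemoteObjectRef create(byte[] ipv4Address, int port, int creationTimeSeconds) {
        if (ipv4Address.length != 4) {
            throw new IllegalArgumentException("expected a 4-byte IPv4 address");
        }
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.put(ipv4Address)
           .putInt(port)
           .putInt(creationTimeSeconds)
           .putInt(NEXT_OBJECT_NUMBER.getAndIncrement());
        return new RemoteObjectRef(buf.array());
    }

    byte[] toBytes() {
        return bytes.clone();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof RemoteObjectRef
            && Arrays.equals(bytes, ((RemoteObjectRef) other).bytes);
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(bytes);
    }
}
```

Address and port make the reference unique across space, while the creation time and object number make it unique over time on the same host and port.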
• Multicast communication is an appropriate model for communication from one process
to a group of other processes.
• Four cases where multicast messages are used
• Fault tolerance based on replicated services
• Discovering services in spontaneous networking
• Better performance through replicated data
• Propagation of event notifications
• IP multicast
• built on top of the Internet Protocol
• allows the sender to transmit a single IP packet to a set of computers that form a multicast group.
• multicast group is specified by a Class D Internet address
• following details are specific to IPv4
• Multicast routers
• Multicast address allocation
• Multicast Routers
• IP packets can be multicast both on a local network and on the wider Internet.
• Local multicasts use the multicast capability of the local network, for example, of an Ethernet.
• Internet multicasts make use of multicast routers, which forward single datagrams to routers on other
networks, where they are again multicast to local members.
• To limit the distance of propagation of a multicast datagram, the sender can specify the number of
routers it is allowed to pass – called the time to live, or TTL for short.
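A minimal sketch of a multicast sender and receiver using java.net.MulticastSocket. The group address 228.5.6.7 and port 6789 are arbitrary values chosen for the example, and the class and method names are hypothetical. The sender limits propagation by setting the TTL. The receiver joins the group, waits for one datagram and leaves again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;

class MulticastSketch {
    static final int PORT = 6789; // arbitrary example port

    // Arbitrary example Class D address.
    static InetAddress group() {
        try {
            return InetAddress.getByName("228.5.6.7");
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Send one datagram to the group; ttl bounds the number of routers passed.
    static void send(String message, int ttl) throws IOException {
        try (MulticastSocket socket = new MulticastSocket()) {
            socket.setTimeToLive(ttl);
            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            socket.send(new DatagramPacket(data, data.length, group(), PORT));
        }
    }

    // Join the group and wait for one datagram, or time out.
    static String receiveOne(int timeoutMillis) throws IOException {
        try (MulticastSocket socket = new MulticastSocket(PORT)) {
            InetSocketAddress groupAddress = new InetSocketAddress(group(), PORT);
            socket.joinGroup(groupAddress, null); // null: use the default interface
            socket.setSoTimeout(timeoutMillis);
            byte[] buffer = new byte[1000];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            socket.receive(packet);
            socket.leaveGroup(groupAddress, null);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        }
    }
}
```

A TTL of 1 keeps the datagram on the local network. Delivery is not guaranteed, so receiveOne may time out even when send succeeded.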
• Multicast address allocation
• Class D addresses (that is, addresses in the range 224.0.0.0 to 239.255.255.255) are reserved for
multicast traffic
• Blocks within this range include the Local Network Control Block, Internet Control Block, Ad Hoc
Control Block and Administratively Scoped Block
• Multicast addresses may be permanent or temporary
• When a temporary group is created, it requires a free multicast address to avoid
accidental participation in an existing group.
• The IP multicast protocol does not directly address this issue.
• If the group is used only locally, setting the TTL to a small value makes collisions with other groups unlikely.
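A small classifier for the address blocks named above. The block boundaries are not given in these notes. The ones used here follow the IANA IPv4 multicast assignments (224.0.0.0/24, 224.0.1.0/24, 224.0.2.0 to 224.0.255.255, and 239.0.0.0/8) and should be read as an assumption of this sketch, as should the class name.

```java
class MulticastBlocks {
    // Classify a dotted-quad IPv4 address by multicast block.
    static String classify(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected a.b.c.d: " + dottedQuad);
        }
        int[] octet = new int[4];
        for (int i = 0; i < 4; i++) {
            octet[i] = Integer.parseInt(parts[i]);
            if (octet[i] < 0 || octet[i] > 255) {
                throw new IllegalArgumentException("octet out of range: " + dottedQuad);
            }
        }
        // Class D: first octet 224..239 (top four bits are 1110).
        if (octet[0] < 224 || octet[0] > 239) {
            return "not multicast";
        }
        if (octet[0] == 224 && octet[1] == 0) {
            if (octet[2] == 0) return "Local Network Control Block";
            if (octet[2] == 1) return "Internet Control Block";
            return "Ad Hoc Control Block";
        }
        if (octet[0] == 239) {
            return "Administratively Scoped Block";
        }
        return "other Class D address";
    }

    public static void main(String[] args) {
        System.out.println(classify("224.0.0.1"));
        System.out.println(classify("239.1.2.3"));
    }
}
```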
• Failure model for multicast datagrams
• Datagrams multicast over IP multicast have the same failure characteristics as UDP datagrams
• IP multicast is unreliable, because it does not guarantee that a message will be delivered to any
particular member of a group.
• Failure characteristics of unreliable IP multicast
• A datagram sent from one multicast router to another may be lost
• A recipient may drop the message because its buffer is full.
• The effect of the failure semantics of IP multicast on the four examples
• Fault tolerance based on replicated services
• Discovering services in spontaneous networking
• Better performance through replicated data
• Propagation of event notifications
• The examples also suggest that some applications have strong requirements for
ordering, the strictest of which is called totally ordered multicast.
[Distributed System] ch4. interprocess communication
