SOA and Data Exchange Standards Recommendations


Published on

1 Comment
1 Like
No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

SOA and Data Exchange Standards Recommendations

  1. 1. Industry Standards for C2 Common Core Data Model & Framework Table of Contents 1. Introduction 2. Categories of Net-Centric Use Cases 3. XML Processing and Schemas 4. Metadata Tagging 5. Data Models and XML Representations 6. Data Transport 7. Enterprise Web of Services 8. Rich Internet Applications 9. Relationships among Standards 10. Conclusion and Recommendations 11. References 12. Appendix - Summary of Standards Information 13. Attachment - Web Services Protocol Functional Collection 14. Attachment - Instant Messaging Protocol Functional Collection 1. Introduction The objective of this report is to "identify current and future specific standards and technologies used by, and projected to be used by industry providers of IT for the DOD C2 community" as requested in the draft CRADA Execution Plan. The report is divided into several sections based on the initial input from JFCOM. The standards discussed in this report include metadata tagging, data models, data transport, messaging, and rich Internet applications. (Enterprise Web Services, XML schemas and Instant Messaging are discussed in detail in NCOIC Protocol Function Collection documents that are attached to this report.) The relationships among the standards are illustrated in an architecture diagram. A conclusion presents some recommendations for future standards-based activities. All of this information is summarized in a tabular form in an appendix. A key issue is the role of the simpler standards (REST, RELAX NG, XML messaging, standard message formats) vs. more complex standards (WS-*, XML Schema, messaging, common data models and semantics). Another important question is when to use the run-time discovery and composition of services and messages as opposed to design-time static implementations. A categorization of the types of net-Centric use cases can be very useful in helping to make these decisions. 2. Categories of Net-Centric Use Cases Net-Centric architectures can be divided into three general categories based on the relationships of participating nodes. In an analogy with earlier classifications, these categories can be described as Intranet-Centric, Extranet-Centric and Internet-Centric. This categorization should be considered as guidelines for recommending standards decisions as opposed to a rigid classification. The essential definitions and scope are:
  2. 2. Intranet-Centric = Activities within a group that has common operational standards (e.g. data models, middleware, processes, platforms). Recommended standards and technology (e.g. Enterprise Service Bus) in this area are chosen by management based on specific organizational requirements and standards. Extranet-Centric = Collaboration across groups with different operational standards but common interoperability standards (e.g. message formats, communication protocols, collaborative processes) based on negotiation. This is the area where the NCOIC recommended interoperability standardizations can have the biggest impact Internet-Centric = Ad hoc interactions among diverse nodes and users based on public standards (e.g. basic Web standards, network protocols, standard data representations) which may not be deployed to all participants. This approach can also be used for rapid development of non-mission critical applications within organizations. Within the government application domain, there is increasing interest for emergency and stability operations which should be addressed by the NCOIC. Table 1 maps some key concepts to Intranet-Centric, Extranet-Centric and Internet- Centric categories. Concepts Intranet-Centric Extranet-Centric Internet-Centric Business analogs Internal Enterprise Business to business Business to public Examples of Use Single organization Joint operations Emergency and Cases operations stability operations Implementation Mandated Negotiated Partial and de facto of standards Coupling Tighter coupling Looser coupling Ad hoc coupling between systems Communities of Communities of Cross COI External non-COI users interest (COI) collaboration collaborators Governance Centralized pre- Contractual Dynamic policies defined policies agreements enforced User Individuals users User organizations Dynamic users and authentication known in advance known in advance organizations Middleware for Enterprise Service SOAP Web Services, Simple Web services Bus Service Gateways services e.g. REST Service Design-time Deployment and Run-time Discovery Configuration Composition and More static and More dynamic based Ad hoc and less mashups controlled on standards maintainable Data sharing Common databases, Standard data models Public databases and data models, and and message formats self describing data semantics for data exchange in messages XML Metadata Internal standards Shared standards Public standards Table 1 Mapping of concepts to use case categories
  3. 3. Figure 1 shows the relationship and interfaces between Intranet-Centric, Extranet-Centric and Internet-Centric Standards Figure 1 Relationship among use case categories and standards 3. XML Processing, Schemas and Data Binding Structured markup enables multiple Internet-Centric standard approaches for parsing of XML documents including Document Object Model (DOM) [1], the Simple API for XML (SAX) [2] , and the Streaming API for XML (StAX) [3]. DOM is the W3C standard tree-based model for XML documents that is easier to access through an API. However building and storing large trees can be costly in terms of processing time and memory use. The Simple API for XML (SAX) is an API for XML parsing that uses an event-driven stream-based approach that avoids the need to build a tree structure. The Streaming API for XML (StAX) is a Java API for pull-parsing of XML documents. Data schemas are templates for describing instances of data content (e.g. documents, messages). There are several templates that can be used to describe XML data content including the XML Schema [4] definition developed by the Worldwide Web Consortium and Relax NG [5] from OASIS which is now an ISO standard. XML Schema is a W3C standard XML specification for defining the contents and structure of XML documents RELAX NG is a simpler schema than W3C XML Schema that can also be used for describing XML documents [6].
  4. 4. XML data binding [7] is the extraction of the content of XML documents to data objects The Java API for XML Binding (JAXB) [8] is a standard from the Java Community Process that uses XML Schemas to create bindings. Standards for other programming languages are currently Intranet-Centric (i.e. organization specific) Using a programming language API based on a schema is usually more efficient than parsing the full document. XSLT [9] is a language for transforming XML documents into a tree structure e.g. to create a new XML document. The transformation is achieved by matching patterns with elements in the document. When a pattern is matched, a template is instantiated as part of the target tree. 4. Metadata Tagging XML schemas and tags can be used to give initial structure to data but it is not sufficient for semantic interoperability. There are several standards that can be used to provide more extensive information about data content to facilitate interoperability. For structured data, standards include Resource Description Framework (RDF) [10], Web Ontology Language (OWL) [11] and Microformats [12]. There is an emerging standard for unstructured data called Unstructured Information Management Architecture (UIMA). The standards for the usage of specific metadata tags can be Intranet-Centric (e.g. internally mandated), Extranet-Centric (e.g. cross-organizational agreements) or Internet- Centric (e.g. public standards) Microformats are markup components embedded in HTML documents. They provide a simple ways to provide information in standardized formats. Internet-Centric examples include hCard for contact information and hCalendar for scheduling events. Microformats could add value in Intranet-Centric and Extranet-Centric applications. The Resource Description Framework (RDF) is a W3C standard for adding semantic content to XHTML pages and XML documents. It is based on triplets in subject- predicate-object form that can be expressed in XML syntax. The subject and predicate are resources that are identified by Universal Resource Identifiers (URIs). The object can be a resource or a Unicode string. RDF Schema is an RDF vocabulary for describing properties and classes of RDF resources including an inheritance semantics. The Web Ontology Language (OWL) OWL extends RDF Schema by adding more vocabulary for describing relations between classes (e.g. disjointness), cardinality (e.g. "exactly one"), equality, richer typing of properties, characteristics of properties (e.g. symmetry), and enumerated classes. Unstructured Information Management Architecture (UIMA) [13] is a framework and SDK for developing such applications being standardized by OASIS. UIMA applications input plain text and extract entities and related information. The framework supports the decomposition of these applications into components and manages the collaboration among components. The components must support the framework APIs and provide
  5. 5. metadata in XML descriptor files. There is an Apache implementation of UIMA under development. 5. Data Models and XML Representations Standard data models are used in many industries to enable Extranet-Centric collaboration across diverse companies. Some examples include ACORD [14] for insurance, HL-7 [15] for healthcare, and FIX [16] for finance. In the C2 space the Command and Control Information Exchange Data Model (C2IEDM) [17] and its extension, Joint Consultation Command & Control Information Exchange Data Model (JC3IEDM) [18] provide standardized data model endorsed by many organizations. The NATO Multilateral Interoperability Programme (MIP) [19] is leading the standardization effort for C2IEDM and JC3IEDM. There are also Intranet-Centric data models in many defense organizations. Extranet-Centric industry ontologies have been defined in some cases to ensure semantic interoperability. IBM has made available Industry Ontology Packs for its customers. Work is underway to add a semantic layer on top of C2IEDM. Intranet-Centric master databases are used to rationalize data across multiple systems. Standardized data models can be used as the schemas for master databases. Legacy data models can be mapped to the standardized model by adaptors. The initial application in many industries is Customer Data Integration (CDI). CDI enables data to be shared across diverse multiple applications. A similar approach has been used in demonstrations of C2 interoperability using C2IEDM as the schema for a master database. Standardized structured message formats can enable information exchange. There are many examples from industry (e.g. EDI [20]) and military applications (e.g. VMF [21]). Basing the message format on a standard data model or ontology enables more robust interoperability. The Coalition Battle Management Language (C-BML) [22] is a new messaging format for C2 Information Exchange being built on top of JC3IEDM. XML representations can be used for the message formats. Several industries have adopted standardized XML formats e.g. XBRL [23] for business reporting. Extensible Battle Management Language (XBML) [24] is a prototype that uses XML representations of C2IEDM data and Web Services interfaces for information exchange. 6. Data Transport There are many messaging products that can be used for data transmission. There are several standards available for messaging. For Java applications, the Java Message Service (JMS) [25] provides a standard application programming interface to messaging middleware. JMS support message queuing and publish/subscribe messaging.
  6. 6. A new standard for near real-time structured data transport is the Data Distribution Service (DDS) API [26] from the Object Management Group. DDS supports QoS-based publish/subscribe messaging based on pre-defined topics. There is a standard DDS Interoperability Protocol [27] that enables different implementations of DDS to work together. The Extranet-Centric Message Exchange Mechanism (MEM) [28] and event-driven Data Exchange Mechanism (DEM) [28] have been defined by the Multilateral Interoperability Programme (MIP) that also developed C2IEDM. The use of standard XML data representations for information exchange is increasing. Textual XML formats can be very verbose compared to binary messages. Efficient XML [29] is a newly adopted standard from the Worldwide Web Consortium (W3C) that supports XML compression that is competitive with existing binary formats. Instant Messaging (IM) is a commonly used collaboration tool for users to exchange information. There are several products in this area that have been adopted as Intranet- Centric standards within organizations. The Extensible Messaging and Presence Protocol (XMPP) [30] is an IETF XML-based instant messaging standard that has also been endorsed by NCOIC's Building Blocks team. The Session Initiation Protocol (SIP) [31] is an IETF standard for initiating and managing sessions that can work with XMPP. The SIP Protocol for Instant Messaging & Presence Leveraging (SIMPLE) is an IETF standard that overlaps XMPP and is also included in the NCOIC IM PFC. [32] The Real- time Transport Protocol is the IETF standard for transmitting non-XML multimedia data that can be used in conjunction with XMPP and SIP. [33] 7. Enterprise Web of Services One of the key issues for the future of distributed architectures is the role of SOAP-based Web Services [34] vs. services based on the original Web architectural style called Representational State Transfer (REST). [35] SOAP-based Web Services use stacks of standards (WS-*) that have been developed by the W3C and OASIS. [36] This architecture has similarities to earlier middleware approaches in that many capabilities are made available at the cost of greater complexity an increased difficulty in maintaining interoperability. The REST approach is simpler and more scalable but has fewer built-in standard services. [37] In general, SOAP-based services are useful for Extranet-Centric architectures where robust supporting services and security are required. REST is often used for Internet- Centric services to take advantage of increased simplicity and Web scalability capabilities (e.g. caching). There is Web Application Definition Language (WADL) [38] under development for REST services that is an attempt to add descriptive information similar to WSDL for SOAP Services. The NCOIC Interoperability Framework (NIF) group created a Web Services Protocol Function Collection (PFC) with detailed descriptions and recommendations on Web
  7. 7. Services and REST architectures. The PFC document is attached to this report and is the NCOIC-endorsed approach. .In Intranet-Centric architectures, Enterprise Service Buses (ESBs) [38] and existing non- HTTP middleware are frequently deployed for increased functionality and performance In general, there are no Extranet-Centric standards beyond Web Services for interoperability across ESBs. However there is a Service Data Objects (SDO) [39] standard under development by OASIS that provides a data graph format for data exchange across distributed applications. Data Access Services (DAS) [40] is a related standard for mapping data from multiple sources (e.g. databases, XML) bi-directionally to SDO. The Business Process Execution Language (BPEL) [41] is a declarative XML-based language for specifying executable composite processes. BPEL can interoperate with ESBs and Web Services. 8. Rich Internet Applications Recently there has been a rapid increase in Web-based applications with richer user interfaces than traditional browser application. This has been made possible by the use of JavaScript, page content descriptions (e.g. XML), and partial updating of sections of Web pages. This technical approach has been given the name Asynchronous JavaScript and XML (Ajax) [42]. There are many toolkits that support Ajax. A new industry group called the OpenAjax Alliance [43] is developing standards to enable interoperability across these toolkits. There are several other technologies that can be used to support rich interface applications. One example is Web data feeds such as RSS [44] and Atom [45]. RSS is a family of standards for distributing data that is posted to subscribed Web sites. RSS content is displayed through a reader that regularly checks sites that for new content and downloading updates. Atom is a newer Web feed alternative to the RSS family that has been endorsed by the IETF. Ajax, Web Feeds and othe data sources (e.g. Google Maps) can be combined to rapidly create Rich Internet Applications. This approach is often called "mashups"[46]. Mashups and services compositions can also be used together to create distributed applications on top of existing capabilities. Typically mashups have been employed with Internet-Centric standards (e.g. REST services). There are currently no standards for combining mashups and Extranet-Centric Web Services. 9. Relationship among Standards Figure 2 shows the architectural relationships among standards
  8. 8. Figure 2 Relationships among Standards Figure 2 show the role of the individual standards in an end-to-end implementation. The implementation layers and some associated standards are: Web Client Applications - Rich Interface Application Web Message Bus - XML Messaging Enterprise Service Manager - SOAP servers Composites - Business Process Execution Language (BPEL) Enterprise Service Bus - Java Message Service (JMS) Enterprise Data and Systems - Service Data Objects (SDO), Data Access Services (DAS) Gateway - XSLT transformations External Networks- Efficient XML, Data Distribution Services (DDS) 10. Conclusion and Recommendations {TBD based on future discussions} 11. Appendix -Summary of Standards Information Table 2 summarizes standards information from this document and the NCOIC Web Services Protocol Functional Collection Document.
  9. 9. Standard or Description and Status and Relationships to Recommendations Technology Purpose Future Plans Other Standards SOAP was SOAP is a protocol SOAP is a W3C SOAP is the foundation SOAP-based Web initially the for XML messaging standard and is used of the Web Services services should comply Simple Object that is used for in many Web stack with Simple Object Access Protocol remote procedure Services deployments Access Protocol calls and one-way (SOAP) 1.1 W3C Note, messaging dated 8 May 2000. Web Services WSDL is an XML- WSDL is a W3C WSDL is used to SOAP-based Web Definition based service standard. WSDL 2.0 specify SOAP services should comply Language description for web is the latest version interfaces with Web Services (WSDL) services. The WSDL Description Language defines services as WSDL is used in (WSDL) 1.1 W3C collections of many Web Services Note, dated 15 March network endpoints, or applications 2001. ports WS- WS-I has several WS-I is an industry Additional WS-I SOAP-based Web Interoperability profiles consisting of group whose Profiles include the services should comply (WS-I)) Basic sets of Web Services. members include the Basic Security Profile with WS-I Basic Profile Profile The Basic Profile 1.2 leading software and the Reliable Secure Version 1.1, dated 10 includes SOAP 1.1, companies including Profile April 2006 and other WSDL 1.1 and IBM, Microsoft, WS-I profiles UDDI v2 BEA, Oracle etc. SOAP-Based SOAP is a W3C There are many WS-* SOAP-based Web Web Services standard. The Web standards built on top of services should comply stacks Services SOAP that are under with WS-I Basic Profile Interoperability development at the Version 1.1, dated 10 Organization (WS-I) W3C and OASIS April 2006. Extranet- publishes profiles of Centric collaborators interoperable should agree on standards supported standards Representational REST is a set of REST is based on Ajax and mashup Use RESTful services State Transfer architectural existing W3C applications can be for large-scale Internet- (REST) principles that are an standards built on top of REST Centric deployments alternative to the Services SOAP WS-* There are many RESTful Web services approach large-scale REST is sometimes should track all implementation of proposed as a simpler necessary session state REST-based services alternative to WS-* with the client and not Web Services the server. XML Schema XML Schema is a XML Schema is a There are several W3C XML Schema, as specification for W3C standard alternatives to XML specified in XML templates that can be Schema including Schema Part 1: used to validate XML Schema is used DTDs and Relax-NG Structures Second conforming XML in many data-Centric Edition W3C data content/ XML XML applications Recommendation, dated Schema Datatypes 28 October 2004 and (XSD)define the data XML Schema Part 2: types that can be used Datatypes Second in XML content Edition W3C Recommendation, dated 28 October 2004
  10. 10. RELAX NG Schema language for RELAX NG is an Simpler schema RELAX NG Schema, XML OASIS standard language alternative to as specified by the XML Schema RELAX NG OASIS Specification dated 3 December 2001 Web Application A machine process- Specification has WADL is an XML- Track development of Description able description of been stable since Nov based language. WADL WADL to see if it Language HTTP-based Web 2006 but has not yet depends on W3C XML becomes a standard. (WADL) applications. WADL been submitted to a Schema or RELAX describes the related standards body. NG for describing available resources, XML message formats their supported Future Plans depends and uses XHTML for inputs and outputs, on feedback from embedding human- and links between implementers. readable resources Alternatives documentation include development within a WADL of a new version of description. the specification and/ or submission to a standards body. Microformats Microformats are The Web site Microformats make it Use de facto industry standardized markup is easier to create standard microformats. embedded within coordinating mashups Consider the possible HTML. Semantics information definition of additional can be embedded and dissemination about microformats targeted encoded using microformats at domain-specific HTML attributes content such as rel, rev, and Many Web sites are class. Examples of using microformats microformats include hCard for contact information and hCalendar for events. Extensible Style XSLT is designed to XSLT is a W3C XSLT uses XPath for Evaluate the recently Sheet Language translate XML standard currently in parsing through XML released XSLT 20 (Jan Transformation documents into other version 2.0 documents 2007) (XSLT) XML documents or other formats. XSLT XALAN-Java is an uses templates, open source pattern matching and implementation of style sheets. XSLT Business Process BPEL provides a BPEL is an OASIS BPEL can interoperate Careful evaluation is Execution declarative format for Standard with Web Service recommended before Language (BPEL) describing executable standards. using BPEL to ensure composite processes that it meets needs Service Data Service Data Objects Service Data Objects Service Component Should be evaluated in Objects (SDO) is a data graph format has been submitted to Architecture uses SDO cases (e.g. Extranet- for representing data. OASIS for approval as a preferred data Centric) where multiple Read and write transport diverse applications access is available Apache Tuscany is an have to exchange data from multiple open source languages implementations . Multiple vendor
  11. 11. versions Data Access Data Access Services Data Access Services SDO uses Data Access Should be used to Services (DAS) provide bidirectional has been submitted to Services to interface to provide the interface to mappings between OASIS for approval data sources data stores when SDO SDO and multiple is the data format for data sources (e.g. Apache Tuscany is an data exchange XML, relational data open source bases) implementation. Multiple vendor versions are available. Efficient XML Efficient XML is a Efficient XML is a Can be used to reduce Should be evaluated for specification for an W3C standard the size of XML transmitting XML data encoding format that messages that are used over networks with allows efficient AgileDelta developed by many industry and constrained bandwidth interchange of the the Efficient XML domain-specific (e.g. mobile networks) XML Information format. They sell an standards Set. implementation including API interfaces to and from applications Data Distribution Specified API for Data Distribution DDS can be interfaced Should be used for data Services (DDS) data-Centric publish- Services is an OMG to ESBs distribution on and-subscribe standard. networks where near messaging for real- DDS can use Efficient real-time delivery and time distributed New DDS XML compression to QoS guarantees are systems Interoperability support XML data important Protocol standard distribution considerations Support subscription will enable by topics, QoS, and interoperability data objects between DDS implementations DDS products are available from RTI and Prismtech Unstructured Defines standard The DARPA - Aligned with SE UIMA seems to be the Information interfaces for Sponsored UIMA standards/tools like leader framework for Management processing Working Group was OMG/MOF/UML, generating metadata Architecture unstructured formed by DARPA aligned with XML and creating workflow (UIMA) information and IBM in January standards (XML processes for (automatically 2005 to facilitate Schema and XMI) unstructured. However generating meta-data cooperation in it should be evaluated for text, image, and advancing the UIMA The UIMA platform carefully for audio as part of architecture. independent spec allow functionality in specific distributed for different transports. domains workflows) and IBM makes UIMA To provide immediate declarative available as a free value it uses WSDL to mechanisms for SDK and makes the define SOAP bindings describing the core Java framework for the general analytic specific behaviors of available as open interfaces.. these interfaces. source software
  12. 12. Joint Consultation JC3IEDM is intended The program is JC3IEDM is a The scope of JC2IEDM Command & to represent the core managed by the successor to C2IEDM. is an organizational Control of the data identified Multilateral It is being used to decision for Intranet- Information for exchange across Interoperability develop new battle Centric use cases. For Exchange Data multiple functional Programme (MIP) management languages Extranet-Centric Model areas and multiple information exchange (JC3IEDM) views of the JC3IEDM seems to be a requirements. preferred data model Resource RDF provides a RDF is a W3C RDF is the basis for The eventual impact of Description lightweight ontology standard other Semantic Web RDF and Semantic Web Framework (RDF) system to support the standards technologies on exchange of industries is still not knowledge on the clear. Web. Web Ontology OWL provides a OWL is a W3C OWL is used to define The eventual impact of Language (OWL) ontology language standard an Ontology for RDF and Semantic Web that can to describe Services called OWL-S technologies on data content industries is still not clear. Extensible A protocol for instant Internet Engineering Interfaces have been Recommended in Messaging and messaging and Task Force (IETF) created between SOAP NCOIC IM PFC Presence Protocol presence that was Standard and XMPP (XMPP) originally developed to support Jabber Six open source instant messaging implementation product available. OpenAjax Hub OpenAjax Hub Twelve toolkits OpenAjax Registry Use standards (Ajax was initially enables Ajax toolkits demonstrated under development supported by the Asynchronous to interoperate. interoperability at OpenAjax Alliance JavaScript and Features include a 2007 Interopfest which is supported by XML ) library manager, all major software publish/subscribe Reference vendors event hub, and implementation at load/unload event SourceForge notification RSS RSS is a family of The RSS 1.* family RSS content is The RSS 2.* family has standards for Web is supported by a specified using XML more support and is the feeds of published RSS-DEV Working preferred choice for data to RSS reader Group RSS Readers can be podcasting software on using Ajax subscribing clients The RSS 2.* family is supported by an RSS Advisory Board Atom Atom is a newer Web Atom is a an IETF Atom uses multiple Atom has many feed and publishing standard XML standards such as enhanced features over standard XML schema and RSS and is supported namespaces by a major standards group. Atom Readers can be built using Ajax Table 2 Summary of standards information 12. References
  13. 13. [1] Document Object Model (DOM) [2] Simple API for XML (StAX)) [3] Streaming API for XML (SAX)) [4] XML Schema [5] RELAX NG [6] XML Schema vs Relax NG [7]Data Binding [8] XSLT [9] Java API for XML Binding [JAXB] [10] Resource Description Framework (RDF) [11] Web Ontology Language (OWL) [12] Microformats [13]Unstructured Information Management Architecture (UIMA) [14] ACORD [15] HL-7 [16] FIX [17] C2IEDM [18] JC3IEDM [19] MIP Programme
  14. 14. [20] EDI [21] VMF [22] JC3IEDM and Coalition Battle Management Language (C-BML) [23] XBRL [24] XBML [25] JMS [26] DDS [27] DDS Interoperability protocol [28]MIP Message Exchange [29] Efficient XML [30] XMPP [31] SIP [32] SIMPLE [33] XMPP, SIP, and RTP in the Real time Internet [34] SOAP [35] REST [36] Emerging Standards for SOA Seminar [37] Combining REST and SOAP Web Services [38] Web Application Description Language (WADL)
  15. 15. [39] Enterprise Service Bus [40] SDO [41] DAS [42] BPE L [43] Ajax [44] OpenAjax Alliance [45] RSS [46] Atom [47] Mashups