Data Best Practice Guide


  1. 1. EBSOA Best Practices Data Standards This document was prepared by Integrated Software Specialists, Inc. (“ISS”) and is to be considered confidential and proprietary to ISS and Iowa Department of Administrative Services.
  2. 2. ISS BEST PRACTICE GUIDE DATA STANDARDS 7/28/06 VERSION 1.1 Copyright 2006 by Integrated Software Specialists (“ISS”), Inc, and the State of Iowa Department of Administrative Services (“DAS”). The copyright to these materials and any accompanying software is owned, without reservation, by ISS and The State of Iowa DAS. These materials may not be copied in whole or part without the express written permission of ISS and The State of Iowa DAS. iOpen™, iOpen ABie™, iOpen ABef™, iOpen ABes™, iOpen ABin™, iInspect™, iDetect™, and SureStart™ are some of the trademarks owned by ISS. Other trademarks and trade names in this documentation are owned by other companies and are used for product and company identification and information only. All rights reserved. ISS is the owner of other registered and unregistered trademarks. The above list is not exhaustive. Contact Us Integrated Software Specialists, Inc. 1901 N. Roselle Road Suite 450 Schaumburg, IL. 60195 1-888-477-0001 If you have suggestions for this publication, send an e-mail message to: docs@issintl.com. Your message is answered by our ISS Documentation Group. Visit our home page at http://www.issintl.com for more information about ISS products and services. CONFIDENTIAL ©2010 INTEGRATED SOFTWARE SPECIALISTS, INC. PAGE 2
  3. 3. Table of Contents

INTRODUCTION
  1.1 Purpose
  1.2 Scope
2 DATA CHALLENGES
  2.1 Message Data
  2.2 Accessing Data in documents and repositories
3 DATA BEST PRACTICES
  3.1 XML Data representation
    3.1.1 Positioning XML in Your Architecture
    3.1.2 Focus on Extensibility and Reuse
    3.1.3 Define Documents Judiciously
    3.1.4 Naming Element-types
    3.1.5 Apply XML consistently
    3.1.6 Choose the Right API for XML Data Manipulation
    3.1.7 Securing XML Documents
    3.1.8 Pick the Right XML Tools
  3.2 XML Data Validation
    3.2.1 XSD Schemas or DTDs
    3.2.2 Positioning Validation in Your Architecture
  3.3 XML Schema Administration
    3.3.1 Formalize Schema Maintenance Procedures
    3.3.2 Metadata Management
  3.4 XML Transformation
    3.4.1 Positioning XSLT in your architecture
    3.4.2 Create dynamic style sheets
    3.4.3 When Not to Use XSLT
  3.5 XML Data Querying
  4. 4. Table of Contents (continued)

    3.5.1 Positioning XQuery in your architecture
    3.5.2 Establish a data policy management layer
    3.5.3 Unify document and data
  3.6 Common Domain Model
  5. 5. Introduction

1.1 PURPOSE

This guide provides a high-level overview of Data Best Practices for Service-Oriented Architecture (SOA) solutions. SOA solutions use XML-based standards to represent, validate, transform, and query data. The guide is aimed at architects and developers who need to examine the data aspects of the Web services they produce.

1.2 SCOPE

Service-Oriented Architecture represents the next major trend in enterprise computing. Adapting to SOA requires new perspectives, techniques, and tools for implementing technology solutions that meet business expectations. The best practices described in this document cover only Web services data and XML data. Other SOA-related topics, such as security standards, service modeling, Service-Oriented Integration, Web services network infrastructure, and platform infrastructure, are covered in other ISS EBSOA best practice guides.
  6. 6. 2 Data Challenges

An SOA consists of a collection of services that communicate with each other. The communication can be a simple exchange of data, or it can involve two or more services coordinating some activity or process. When considering data in an SOA, one must address:

  • XML data in messages between sender and receiver
  • Accessing XML data in data stores and documents

[Figure 2-1. Services in a Service-Oriented Architecture (Service A and Service B).]

2.1 MESSAGE DATA

In an SOA, all communication between services is message-based. The Simple Object Access Protocol (SOAP) specification defines a standard message format used to pass messages in an SOA. As shown in Figure 2-2, a SOAP message consists of an envelope that contains a header and a body. The body of the message contains the payload.
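The envelope, header, and body structure described above can be illustrated with a minimal sketch. It assumes SOAP 1.1 (whose envelope namespace is used below) and Python's standard xml.etree.ElementTree module; the PermitRequest payload and its ApplicantName field are hypothetical names invented for the example, not part of the guide.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (assumption: the services use SOAP 1.1).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(payload):
    """Wrap a payload element in an envelope with an empty header and a body."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


def extract_payload(envelope):
    """Return the first child of the Body element, i.e. the payload."""
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP envelope has no body payload")
    return body[0]


# Hypothetical payload with a single field.
request = ET.Element("PermitRequest")
ET.SubElement(request, "ApplicantName").text = "Jane Doe"

# Sender side: serialize the message. Receiver side: parse it and read the payload.
wire_format = ET.tostring(build_envelope(request), encoding="unicode")
received = ET.fromstring(wire_format)
payload = extract_payload(received)
print(payload.tag, payload.find("ApplicantName").text)
```

In practice a Web services toolkit usually builds and parses the envelope; the sketch only makes the three-part structure visible.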
  7. 7. [Figure 2-2. Structure of a SOAP Message: a SOAP envelope containing a header and a body.]

Processing messages is a major service activity within an SOA. Common steps for processing a message are:

  • Data validation
  • Data transformation
  • Content-based routing of the message

2.2 ACCESSING DATA IN DOCUMENTS AND REPOSITORIES

Once data is represented as XML, it can be stored in XML documents and repositories. These documents and repositories can subsequently be queried and maintained.
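Of the message processing steps listed under section 2.1, content-based routing is perhaps the least self-explanatory. The following is a minimal sketch of the idea, not a prescribed design: the routing table, the service names, the PaymentNotice payload, and the 10,000 threshold are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: payload root element -> destination service.
ROUTES = {
    "PermitRequest": "permit-service",
    "PaymentNotice": "billing-service",
}

# Hypothetical business rule: large payments go to a review queue.
REVIEW_THRESHOLD = 10000.0


def route(payload_xml):
    """Choose a destination by inspecting the content of the payload."""
    payload = ET.fromstring(payload_xml)
    if payload.tag == "PaymentNotice":
        amount = float(payload.findtext("Amount", default="0"))
        if amount > REVIEW_THRESHOLD:
            return "billing-review"
    # Unknown message types go to a dead-letter destination.
    return ROUTES.get(payload.tag, "dead-letter")


print(route("<PaymentNotice><Amount>250.00</Amount></PaymentNotice>"))
print(route("<PaymentNotice><Amount>25000.00</Amount></PaymentNotice>"))
print(route("<Unknown/>"))
```

The router reads only the fields it needs to decide on a destination, which keeps the routing rules independent of the rest of the payload.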
  8. 8. 3 Data Best Practices

3.1 XML DATA REPRESENTATION

XML as a data representation technology is a core part of integration solutions. This section covers how best to integrate XML data representation into your application environment.

3.1.1 Positioning XML in Your Architecture

XML has many uses within application and integration architectures. Incorporated strategically, XML offers a more flexible architecture. For instance, XML may be used as a:

  • Data transport format within applications
  • Transport format between applications
  • Data bridge within applications
  • Data bridge between applications

3.1.2 Focus on Extensibility and Reuse

With any application, business and data models change over time. Building a flexible document structure that can be extended and reused is ideal for future enhancements. IT Management needs to be conscious of the following when building new applications:

  • Use generic element-type names. Avoid incorporating agency, department, or project names, as they may change over time.
  • Create a series of modular XML schemas to partition large document structures into reusable substructures.
  • Do not hardcode namespace references.
  • Avoid or limit the use of recursive elements (i.e. elements that are nested within themselves). For example, your service may need an envelope element that contains letter elements and envelope elements. In this case, the envelope element is recursive (i.e. an envelope contains an envelope). If elements are nested within themselves, the schema has no way to control the number of levels.
  • Incorporate the XSD "any" element to preserve future extensibility. The element type ANY from DTDs allows only elements that have been declared in the DTD, while the type "anyType" from XML Schema allows any kind of element inside it (even elements that were not declared in the schema).
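Because a schema cannot cap the depth of a recursive element, an application that accepts such documents may want to check the depth itself. The sketch below is one possible way to do that with Python's standard library; the envelope and letter element names follow the example in section 3.1.2, while the limit of two levels is an arbitrary assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical limit on how many envelopes may be nested inside one another.
MAX_ENVELOPE_DEPTH = 2


def nesting_depth(element, tag):
    """Count the elements named `tag` on the deepest path below and including `element`."""
    deepest_below = max((nesting_depth(child, tag) for child in element), default=0)
    return deepest_below + (1 if element.tag == tag else 0)


def within_depth_limit(xml_text, tag="envelope", limit=MAX_ENVELOPE_DEPTH):
    """Return True when the document nests `tag` no deeper than `limit`."""
    return nesting_depth(ET.fromstring(xml_text), tag) <= limit


SHALLOW = "<envelope><letter/><envelope><letter/></envelope></envelope>"
DEEP = "<envelope><envelope><envelope><letter/></envelope></envelope></envelope>"

print(within_depth_limit(SHALLOW))
print(within_depth_limit(DEEP))
```

A check like this belongs in application code, after schema validation, because it covers a rule the schema cannot express.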
  9. 9. 3.1.3 Define Documents Judiciously

When modeling and designing XML documents for an application, the modeler and designer make a number of decisions that affect application security and performance. In XML-enabled applications, structured data constantly flows through application layers, so keep XML documents trim. Avoid designing extraneous data fields that are not used by the application.

XML data cannot be easily appended to an XML document the way data is appended to a delimited flat file. Invest in a more serious modeling effort up front to minimize the chance that a document structure will have to be altered in the future.

XML documents are stored in a hierarchical manner known as trees. Although XML documents can easily represent traditional one-to-one and one-to-many relationships between data elements, they are less accommodating of many-to-many relationships.

3.1.4 Naming Element-types

XML allows fully customized documents to be modeled. While it may be tempting to create documents with field names that completely describe each field's use, doing so has a performance cost, because every name is repeated in each start and end tag. Consider modeling your documents for processing efficiency by keeping the names of elements that are likely to be repeated to a minimal length. To support maintenance of documents, add descriptive comments directly to each document.

3.1.5 Apply XML consistently

To keep solutions easy to maintain, XML must be applied in a standard manner within them. After selecting an XML standard and an approach for document processing and delivery, do not deviate from the approach unless necessary. Define standard data flow processing for XML documents (e.g. always perform the validation and transformation steps in the same sequence), and establish standards that outline how APIs will process XML input and output.
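One way to read the advice in section 3.1.5 is to define the processing sequence once and route every document through it. The sketch below assumes a two-step sequence (validate, then transform); the required field names and the InternalPermit target format are hypothetical, and the validation step is a simple presence check standing in for real schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical required child elements of an incoming document.
REQUIRED_FIELDS = ("ApplicantName", "PermitType")


def validate(doc):
    """Reject documents that lack a required child element."""
    missing = [name for name in REQUIRED_FIELDS if doc.find(name) is None]
    if missing:
        raise ValueError("missing required elements: " + ", ".join(missing))
    return doc


def transform(doc):
    """Copy the children under the root element of a hypothetical internal format."""
    out = ET.Element("InternalPermit")
    out.extend(list(doc))
    return out


# The standard sequence is defined in one place, so every caller gets the same order.
PIPELINE = (validate, transform)


def process(xml_text):
    doc = ET.fromstring(xml_text)
    for step in PIPELINE:
        doc = step(doc)
    return doc


result = process(
    "<PermitRequest><ApplicantName>Jane Doe</ApplicantName>"
    "<PermitType>Building</PermitType></PermitRequest>"
)
print(ET.tostring(result, encoding="unicode"))
```

Keeping the sequence in a single tuple means a change to the standard order is made once rather than in every service.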
3.1.6 Choose the Right API for XML Data Manipulation

In today's IT world, XML is a popular medium for exchanging data. To leverage XML data, both sender and receiver must understand how to manipulate it. Currently, the most popular APIs for manipulating XML documents are DOM and SAX. Both provide structure-centric interfaces into an XML document, which may become burdensome when working with static document models. Data binding is an alternative approach: data binding APIs offer data-centric interfaces into data bodies that are automatically mapped to XML documents.
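To make the difference between the three styles concrete, the sketch below reads the same small document in each of them using Python's standard library. The orders document is invented for the example, and the Order class is a hand-written stand-in for data binding: real data binding tools typically generate such classes from a schema, which this sketch does not do.

```python
import xml.dom.minidom
import xml.etree.ElementTree as ET
import xml.sax
from dataclasses import dataclass

DOCUMENT = (
    "<orders>"
    "<order id='1'><total>10.50</total></order>"
    "<order id='2'><total>4.25</total></order>"
    "</orders>"
)


def sum_totals_dom(xml_text):
    """DOM: load the whole tree into memory, then navigate it."""
    dom = xml.dom.minidom.parseString(xml_text)
    return sum(float(node.firstChild.data) for node in dom.getElementsByTagName("total"))


class TotalHandler(xml.sax.ContentHandler):
    """SAX: react to events as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.total = 0.0
        self._in_total = False
        self._chunks = []

    def startElement(self, name, attrs):
        if name == "total":
            self._in_total = True
            self._chunks = []

    def characters(self, content):
        if self._in_total:
            self._chunks.append(content)

    def endElement(self, name):
        if name == "total":
            self.total += float("".join(self._chunks))
            self._in_total = False


def sum_totals_sax(xml_text):
    handler = TotalHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.total


@dataclass
class Order:
    """Hand-written stand-in for a class a data binding tool would generate."""
    id: int
    total: float


def bind_orders(xml_text):
    """Data binding style: callers work with Order objects, not XML nodes."""
    root = ET.fromstring(xml_text)
    return [Order(int(o.get("id")), float(o.findtext("total"))) for o in root.findall("order")]


print(sum_totals_dom(DOCUMENT), sum_totals_sax(DOCUMENT), bind_orders(DOCUMENT))
```

The DOM version is the shortest but holds the whole document in memory; the SAX version needs more code but keeps only the current element's text; the binding version hides the XML structure from its callers.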
  10. 10. The following summarizes each XML API:

API Features

DOM                                  | SAX (or Streaming API)              | Data Binding
High memory usage                    | "Lightweight"                       | "Lightweight"
Uses trees for reading, modifying,   | Read-only sequential access to      | Maps XML Schema to classes/structs
and writing access to structured     | structured data                     | and back
data                                 |                                     |

Note: SAX may be used to create and read portions of a DOM tree.

Here are some guidelines for choosing an XML API.

Use DOM:

  • For small to medium size documents
  • When modifying the document structure at run-time
  • When your application requires immediate access to preloaded XML document information, without performing file I/O

Use SAX:

  • For large documents
  • For documents with a limited number of nested elements
  • When DOM is too slow
  • When only a fraction of the total document data is needed

Use Data Binding APIs:

  • When working with static XML documents
  • When requiring a class-oriented interface to XML documents
  • To simplify data access programming logic

3.1.7 Securing XML Documents

XML-formatted data can travel through various tiers of a typical SOA solution and can even venture beyond your internal environment into the outside world. XML security best practices are addressed in a separate document, the “EBSOA Web Services Security” document.
  11. 11. 3.1.8 Pick the Right XML Tools

Choose XML development and authoring tools carefully. A product can support XML in many ways; the degree to which that support is relevant and useful to your development effort is something you must investigate. Many tools auto-generate markup and code, or generate proprietary markup and code, which may unnecessarily bloat the document, schema, and style sheet files. A key feature to look for when assessing XML tools is the feedback the authoring tool provides. A useful tool gives feedback that helps you troubleshoot and resolve code issues.

3.2 XML DATA VALIDATION

3.2.1 XSD Schemas or DTDs

DTDs, originally created for SGML and used with HTML and XML, represent an established, traditional approach to XML data validation. The XML Schema Definition Language (XSD) is a more recent innovation that has matured and gained industry acceptance. XSD schemas offer features that allow for the creation and validation of complex XML documents. Compared to XSD, the features offered by the DTD language are severely limited.

Reasons to use DTDs over XSDs include:

  • If the structure of your XML documents is relatively simple, and you are confident it will not change, DTDs provide an easy-to-use syntax that can efficiently be incorporated into an application.
  • DTD files tend to be smaller than XSD schemas. If your validation code needs to be transmitted across distributed environments, DTDs may provide a lightweight, bandwidth-friendly alternative.
  • If proprietary tools or products are part of your fixed application environment, they may not yet support XSD schemas. If integration with these products is a project requirement, you may be forced to work with DTDs.
  • Since DTDs have been around much longer, it is reasonable to expect that some of your existing developers already have some expertise. This may be a factor, considering that the learning curve imposed by the XML Schema Definition Language is significant.

Reasons to choose XSD over DTDs include:

  • XSD schemas are the future. If you are starting from scratch, with no predispositions or prerequisites, the better choice for building your application is XSD schemas.
  12. 12.
  • If the structure of your documents is complex and can benefit from multiple data type support as well as highly customizable validation rules, then XSD schemas will be required.
  • XSD schemas can facilitate complicated data representation requirements, and are better suited to tight integration with relational databases.
  • Only XSD schemas are supported natively by the Simple Object Access Protocol (SOAP). Because SOAP is the primary messaging protocol used by Web services, this is a critical consideration.
  • XSD schemas have far more extensibility options than DTDs.
  • Better data-binding libraries are built on top of XSD than on DTDs.

In summary, XSDs are a good choice for most organizations that are just getting started with XML schemas.

3.2.2 Positioning Validation in Your Architecture

Data validation ensures the integrity of data received in messages and extracted from repositories. DTDs and XSDs define a document layout that is used to validate XML document structure and content. However, both DTDs and XSDs have limitations. If you rely solely on either for validation, be aware that both offer limited or no support for:

  • Conditional constraints
  • Inter-element dependencies
  • Cross-document validation
  • Null values for attributes
  • Validation of large numeric values

3.3 XML SCHEMA ADMINISTRATION

3.3.1 Formalize Schema Maintenance Procedures

Ad hoc schema creation is a common problem in many organizations. Addressing it requires an effort to establish a schema maintenance process. The suggested approach for establishing a maintenance process is as follows:

  • Assign ownership of each schema to an individual or group
  • Standardize the tool-set for schema development and maintenance
  • Standardize XML schemas
  13. 13.
  • Create an application review process that ensures schemas adhere to standards
  • Create a schema development and maintenance process
  • Communicate the process, standards, and technology to development groups
  • Version-control your schemas and associated metadata

3.3.2 Metadata Management

Organizations are increasingly concerned with the need to manage metadata, the information that describes the structure and content of their corporate data. Properly managed metadata enables rapid, even automatic, development of new application interfaces. Without it, interface development requires costly, handcrafted code, and enterprise agility through Service-Oriented Architectures is nearly impossible to achieve.

The move to Java 2 Platform, Enterprise Edition and .NET service-oriented architectures requires organizations to understand model-based business processes and data architectures. At the same time, developers need work processes and tools that reuse and manage interrelated metadata across suites and environments.

The established approach to metadata management is based on ISO 11179, a six-part International Standard for metadata registries. It describes a metadata registry, how to classify data, and how to store and manage descriptions of data. It assumes the established entity-relationship model that is associated with relational databases, the traditional data storage paradigm that is expected to continue to dominate for at least the next decade. Its basic data classification unit is the data element. Data elements can readily be identified in bottom-up fashion from enterprise database schemas and documentation. So, where there is a customer database with a “name” field, “customer name” is a data element. This is a straightforward way of capturing and managing metadata for existing and new applications within an enterprise.
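The bottom-up identification of data elements described above can be sketched in a few lines. This is only an illustration of the “customer name” example: the DataElement record, the schema dictionary, and the column types are hypothetical, and the sketch does not attempt to model an ISO 11179 registry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Hypothetical registry entry derived from one database column."""
    name: str
    source_table: str
    source_column: str
    data_type: str


def derive_data_elements(schema):
    """Bottom-up: produce one data element per column of each table."""
    return [
        DataElement(f"{table} {column.replace('_', ' ')}", table, column, data_type)
        for table, columns in schema.items()
        for column, data_type in columns.items()
    ]


# Hypothetical database schema: table name -> {column name: column type}.
SCHEMA = {"customer": {"name": "VARCHAR(80)", "postal_code": "CHAR(5)"}}

elements = derive_data_elements(SCHEMA)
for element in elements:
    print(element.name, "<-", f"{element.source_table}.{element.source_column}")
```

Recording the source table and column alongside each element keeps the registry traceable back to the database it was derived from.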
3.4 XML TRANSFORMATION

Being able to transform XML documents into multiple output formats has become a key part of contemporary XML architectures. XSLT has become the foremost technology for this function. Cascading style sheets may be used in some cases.

3.4.1 Positioning XSLT in your architecture

XSLT is valuable in several ways within application and integration architectures. XSLT style sheets are commonly used in an N-tier architecture as translation mechanisms for document-based and repository-based information. XSLT may be used as a transformation technology within an application. It may also be used as a presentation unifier within a single application.
  14. 14. 3.4.2 Create dynamic style sheets

Like other XML documents, XSLT style sheets may be created on the fly at run-time. This is useful when style sheets rely on business rules that require access to outside data sources. Also consider Cascading Style Sheets (CSS) in some cases, for example when formatting information needs to travel with an XML document.

3.4.3 When Not to Use XSLT

XSLT processors typically load the entire document into memory before processing it. This may lead to performance issues in enterprise applications, in which case you may need an alternative to XSLT, such as an enterprise transformation engine or a custom transformation solution built on XPath or SAX.

3.5 XML DATA QUERYING

Once data is represented as XML, it conforms to a standard format that allows it to be searched and manipulated using a standard query language. The XQuery language is a technology for the standardized querying of XML data.

3.5.1 Positioning XQuery in your architecture

XQuery may be used in multiple ways within application and integration architectures. It can be used within an application to unify data from multiple sources. In some environments, it can be used to return a single result from data sources that span applications.

3.5.2 Establish a data policy management layer

When XQuery is properly architected to connect to multiple repositories within an enterprise, it becomes a powerful data access mechanism. Strategically placing an XQuery layer between applications and multiple data repositories allows for the creation of policies that are applied at one layer but enforced across all related applications and repositories.

3.5.3 Unify document and data

Standardizing on XML as an application data representation technology establishes a consistent format for data residing in corporate repositories. XQuery can be used to issue sophisticated searches against XML data repositories. In addition, if XML documents are created and stored, they too may be searched. XQuery may be used to search data repositories and documents and combine both in a single result set.
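The idea of combining repository data and stored documents in one result set can be illustrated without an XQuery engine. The sketch below is not XQuery: it substitutes the limited XPath subset supported by Python's standard xml.etree.ElementTree module, and the customer repository extract and contracts document are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical extract from a customer repository, represented as XML.
REPOSITORY_XML = """<customers>
  <customer id="c1"><name>Acme Supply</name><state>IA</state></customer>
  <customer id="c2"><name>Prairie Tools</name><state>IL</state></customer>
</customers>"""

# Hypothetical stored XML document that refers to customers by id.
DOCUMENT_XML = """<contracts>
  <contract customer="c1"><title>Paper supply</title></contract>
  <contract customer="c2"><title>Tool lease</title></contract>
  <contract customer="c1"><title>Toner supply</title></contract>
</contracts>"""


def contracts_for_state(state):
    """Join repository data with document data and return one result set."""
    customers = ET.fromstring(REPOSITORY_XML)
    contracts = ET.fromstring(DOCUMENT_XML)
    results = []
    # Predicate on a child element's text, then on an attribute value.
    for customer in customers.findall(f"customer[state='{state}']"):
        customer_id = customer.get("id")
        for contract in contracts.findall(f"contract[@customer='{customer_id}']"):
            results.append((customer.findtext("name"), contract.findtext("title")))
    return results


print(contracts_for_state("IA"))
```

An XQuery processor would express the same join declaratively in a single query; the loop here only shows what such a query has to accomplish.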
  15. 15. 3.6 COMMON DOMAIN MODEL

Integration solutions involve business applications running on disparate systems that are interconnected by a maze of connections and protocols. As a result, every solution is different. If you have several applications supporting the same domain, creating a common domain model provides a common language for all applications to use when communicating with each other. When a new application is added to the domain, it can integrate and exchange data with existing applications, provided it uses the same common domain model. A common language also creates a set of familiar terms and processes (or shared knowledge capital) that makes the human interaction and discussion involved in managing the domain much simpler.

A common domain model provides a level of indirection between data formats. Incoming messages are transformed into a model that is common to all applications. The common domain model is then transformed so that it can be consumed by the destination application. The advantage is that you can add a new source application or a new destination application just by adding new message transformations.

If your Web services interface does not inherently support a common domain model, it needs an adapter to convert outgoing requests to the common domain model before placing a message for delivery to its target application. The data types used in the message schemas of the common domain model are also important. A data dictionary should be created that describes each field in each XML schema within the domain model. It should specify what each field means, the values it should hold in different scenarios, the data types it supports, and any other information that someone might need in order to map their proprietary XML schemas onto the common data format.
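The level of indirection described in section 3.6 can be sketched as two small registries of transformations, one into the common model and one out of it. Everything named below is hypothetical: the licensing and billing applications, their message formats, and the Person element standing in for the common domain model.

```python
import xml.etree.ElementTree as ET


def licensing_to_common(xml_text):
    """Adapter: hypothetical licensing message -> common Person element."""
    source = ET.fromstring(xml_text)
    common = ET.Element("Person")
    full_name = f"{source.findtext('First')} {source.findtext('Last')}"
    ET.SubElement(common, "FullName").text = full_name
    return common


def common_to_billing(common):
    """Adapter: common Person element -> hypothetical billing message."""
    destination = ET.Element("payer")
    destination.set("name", common.findtext("FullName"))
    return destination


# One transformation per application, keyed by application name.
TO_COMMON = {"licensing": licensing_to_common}
FROM_COMMON = {"billing": common_to_billing}


def deliver(source_app, destination_app, xml_text):
    """Translate a message by passing it through the common domain model."""
    common = TO_COMMON[source_app](xml_text)
    return FROM_COMMON[destination_app](common)


message = "<Licensee><First>Jane</First><Last>Doe</Last></Licensee>"
delivered = deliver("licensing", "billing", message)
print(ET.tostring(delivered, encoding="unicode"))
```

With this arrangement, connecting N source formats to M destination formats takes N + M adapters rather than one transformation for each of the N × M pairs, which is the saving the common model is meant to provide.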
