Data Integration Is Key to Realizing the Full Potential of SOA




While the benefits of a service-oriented architecture (SOA) are clear, early adopters are finding that underlying data integration is necessary to realize its full potential.
One of the reasons customers have been slow to adopt and reap the benefits of a service-oriented architecture (SOA) is that services are only as effective as the access they have to information spread across the enterprise. While standards and technologies have matured to help IT organizations turn application logic into services, not enough effort has been spent ensuring that these services have reliable and consistent access to, and knowledge of, the underlying data they depend on. This has led to a lack of reuse, increased development time for services, increased complexity, and higher maintenance costs.

The missing piece is a data integration layer that abstracts business logic from the underlying data structures. A data integration layer understands and maintains the location, structure, format, synchronization patterns, and cross-reference relationships of the underlying data. It uses this information as a foundation to provide data integration services that reliably access and update the data, greatly increasing productivity and reducing the cost and complexity of building and maintaining business services, and thus of implementing an SOA. Furthermore, a data integration layer provides the information and processes to unify data into a single logical view, which can then be stored in an operational data store or data warehouse, or accessed via business activity monitoring (BAM) tools for real-time reporting.

Responding to Evolving Business Challenges

As organizations face ever-increasing challenges to stay competitive, accessibility and accuracy of information have become critical success factors. Better access to information can improve operational efficiency, customer service, marketing effectiveness, and asset and resource management, all directly leading to a positive impact on the bottom line.
For example, manufacturing firms can benefit from visibility into their supply chain to shrink order lead times and better manage their inventory. Retail firms can use up-to-the-minute sales data to plan replenishment schedules and eliminate stock-outs. Financial services firms can effectively cross-sell and up-sell additional services based on a single view of the customer. Organizations undergoing merger and acquisition activity can maintain excellent service levels to current and new customers based on a unified view of customer information.
The Data Management Bottleneck

Data within most companies resides in different systems and formats. It is usually stored in a manner optimized for use within an individual application, with little awareness of or regard for how related information is stored in other applications. Usually, data from multiple systems needs to be aggregated and correlated so that it translates into actionable business information and can be consumed by a decision maker.

Traditionally, the barrier to unlocking data was connectivity to the system where the data was stored. As integration technologies have matured over the years, basic data access is usually no longer the problem. But easier access to data has opened up a new range of problems relating to understanding and managing data across multiple systems, many of which are controlled by different groups within the IT organization with little collaboration across groups. The data problem within organizations is further compounded by the following factors:

• The explosion of data volume within enterprises over the last several years
• The ever-increasing speed of business, which forces organizations to have the information they need at their fingertips or be left behind

IT organizations have tried to solve their information management deficiencies with integration software and services. However, this has often been done in a reactive manner, leading to hundreds if not thousands of point-to-point integration interfaces, which can be inflexible and expensive to maintain. Sophisticated IT organizations have taken a top-down, holistic approach to their environment and have leveraged message-oriented middleware (MOM) to put in place an information backbone that facilitates the flow of information within the organization.
The market has taken this one step further with the evolution of service-oriented architecture (SOA) initiatives, where data and application logic within the organization are no longer considered the proprietary domain of any one system but are instead converted into reusable shared services that cut across systems and can be used as building blocks to compose composite applications that drive the automation of business processes and decision making within a company. While the industry may argue over the exact definition of an SOA and the components that go into an SOA strategy, there is general consensus that this approach, if architected correctly, leads to greater reuse, productivity, and interoperability, and a lower cost of ownership for IT.

Until now, the technology standards for building an SOA that have gained the most traction, like SOAP and WSDL, have focused on building services out of the business logic trapped within the many applications that run an organization's day-to-day operations. Not enough attention has been paid to the data itself within these applications. As a consequence, IT organizations still struggle with making the underlying data that these services depend on accurate, available, and easily understood. Even if the data is accessible through adapters and code, there is no clarity on the semantic meaning of data spread out over multiple systems. This impacts the ability to aggregate data into a format that makes business sense. Finally, different consumers of services access the same sets of data in a point-to-point manner and do not follow consistent rules for formatting, cleansing, and interpreting the data, leading to inaccuracies.

The Adoption Hurdle – Limitations of Current Implementations

Current integration solutions, including most SOA initiatives today, focus on encapsulating the business logic of applications as services. Just as developers of business services and composite applications would like to black-box business logic as reusable services with clearly defined interfaces, these services in turn need to be able to black-box access to the underlying data via a data integration layer. While the interaction between service providers and consumers might be loosely coupled based on a messaging backbone or an enterprise service bus (ESB), the interaction between services and the underlying data remains point to point.

For example, a bank may want to provide customer service representatives with customer information to enable them to provide better service and take advantage of up-sell and cross-sell opportunities. To do so, they need access to customer information spread out across multiple systems. They need to build services that return customer information such as purchase history, address, and billing information. The IT organization decides to support this initiative by building reusable services called GetCustomerHistory and GetCustomerAddress. Both services deliver different views of customer data and are loosely coupled from the calling application.
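As a minimal sketch, the two customer services just described might be exposed behind interfaces like the following. All type and field names here (Order, Address, the in-memory back end) are assumptions invented for illustration, not TIBCO or bank APIs:

```java
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names and data are invented for this example.
public class CustomerServicesSketch {
    public record Order(String orderId, double amount) {}
    public record Address(String street, String city) {}

    // The two reusable business services described in the text
    public interface CustomerServices {
        List<Order> getCustomerHistory(String customerId);  // aggregated purchase history
        Address getCustomerAddress(String customerId);      // current address details
    }

    // Toy in-memory implementation standing in for the real back-end systems
    public static class InMemory implements CustomerServices {
        private final Map<String, List<Order>> orders =
            Map.of("C42", List.of(new Order("O-1", 99.50), new Order("O-2", 12.00)));
        private final Map<String, Address> addresses =
            Map.of("C42", new Address("1 Main St", "Palo Alto"));

        @Override public List<Order> getCustomerHistory(String id) {
            return orders.getOrDefault(id, List.of());
        }
        @Override public Address getCustomerAddress(String id) {
            return addresses.get(id);
        }
    }
}
```

Callers see only the interface; the question the paper raises is what sits behind these methods when the data lives in several systems at once.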
Data such as purchase history is not necessarily stored in one single place and may need to be aggregated across multiple order systems. If these services were developed in a service composition tool such as TIBCO BusinessWorks™ (or any standards-based service container such as .Net or Java EE), the developer of each service would independently need to understand the structure and relationships of how customer data is stored in the underlying systems in order to aggregate and cross-reference the data to produce the specific view of information they require. If they are lucky, the developers of these services would be able to reference data architecture documentation maintained by the IT department covering the schemas and APIs of the underlying systems. Assuming such documentation is available, and more importantly up to date, they would still need to do the bulk of the data integration development work twice, even though the services are going after the same superset of information. Furthermore, if there were a change to the structure of an underlying system, that change would need to be made in each service. This increases the development time, complexity, and maintenance costs of the services and hampers the ability to rapidly reconfigure and assemble services into composite applications, one of the promises of an SOA.

Once these services are built, applications and people that need access to customer purchase history and customer address can rely on them to deliver that information, and the business as a whole benefits. However, if the time and cost of developing and maintaining such services does not decrease with each additional service, the larger SOA initiative can become cost prohibitive and the benefits and promises of an SOA are lost.

Building a Robust Data Integration Layer

Ideally, each time a service is created that needs access to customer information, it does not have to read and write against multiple underlying systems and hence is not forced to understand the format, structure, and relationships of how customer data is stored. One solution is to create a data integration layer that insulates the consumers of data from where and how it is stored. This requires IT to first define the standard data model for customers, products, employees, and so on. Regardless of how data is stored within the organization, IT must determine the logical structure of the data that makes business sense and satisfies all the stakeholders that depend on that data to perform their business functions. For example, IT might determine that the customer data object should have ten attributes, including name, address, phone number, account number, etc. They must then profile the actual data to understand its physical structure and quality.
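A ten-attribute customer object of the kind just described can be sketched as one canonical type that every consuming service codes against. The specific attribute names below are assumptions for illustration:

```java
// Canonical (logical) customer model; the specific ten attributes are
// illustrative, chosen to match the examples in the text.
public record Customer(
    String accountNumber,
    String firstName,
    String lastName,
    String street,
    String city,
    String postalCode,
    String country,
    String phoneNumber,
    String email,
    String creditRating) {}
```

The data integration layer then owns the mapping from each physical store to this one logical shape, so services never see the physical schemas.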
Most organizations underestimate this task and do not spend enough time and effort to inventory their data assets. Once this has been done comes the painstaking task of mapping the relationship between the way data is actually stored to the logical data model. Building a common data model and then profiling and mapping the physical data itself can seem like a daunting task and paralyze IT departments into inaction. The best approach is to attack this in an incremental manner. IT can pick a business object such as Product or Customer and start from there, or more likely even a subset of a business object. The structure of the physical model, the logical model, and the rules to translate the physical model into the logical model is the metadata that comprises the data integration layer. Knowledge of, and access to, this metadata is key to the reuse and productivity D AtA I n t e g r At I O n I S K e y t O r e A l I z I n g F u l l P O t e n t I A l O F S O A
  6. 6. gains promised by SOA solutions. These rules comprise information about the data location, format, relationships, transformation logic, cleansing rules, and cross-reference relationships to translate the data from multiple disparate systems into the common logical data model. These rules may also include custom business logic that defines how data is handled and massaged internally, for example how a customer’s credit score is calculated. Once such a data integration layer is in place developers only need access to the metadata to build services. The metadata can be maintained in a popular standard such as XML and stored in a repository where it can be securely accessed, versioned, and managed for changes in functionality such as impact analysis. A data integration layer that understands and maintains the location, structure, format, and relationships of the underlying data allows for a loose coupling between the actual data and the services that rely on that data. This architecture greatly reduces the cost and complexity of implementing an SOA and thus should be an essential component of any SOA initiative. An Example of SOA at Work – Operational Data Store To realize the maximum benefit, a data integration layer should go beyond just managing information about the data to provide the means to reliably access and update the data itself. Building the logical data model and managing the associated metadata and rules is the biggest challenge. Once this infrastructure is in place, IT is equipped with the know- how and the tools to access and disseminate information throughout the organization in a format that can be advantageous for the business, for example aggregating a customer’s purchase history across multiple product lines. One popular approach for this information access and sharing is to build an operational data store (ODS). 
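To make the mapping metadata described above concrete, here is one illustration of how it might be expressed in XML. Every element and attribute name in this fragment is invented for the sketch; it is not a TIBCO or industry-standard schema:

```xml
<!-- Illustrative only: element names are invented for this sketch -->
<logicalEntity name="Customer">
  <attribute name="accountNumber">
    <source system="CRM" table="CUST_MASTER" column="ACCT_NO"/>
    <transform rule="trim"/>
  </attribute>
  <attribute name="phoneNumber">
    <source system="Billing" table="CONTACT" column="PHONE"/>
    <cleanse rule="normalize-phone"/>
  </attribute>
  <!-- Same customer known by different keys in different systems -->
  <crossReference>
    <key system="CRM" column="CUST_ID"/>
    <key system="OrderFulfillment" column="CUSTOMER_REF"/>
  </crossReference>
</logicalEntity>
```

Because the location, transformation, and cleansing rules live in this metadata rather than in each service, a schema change in a source system is absorbed by updating one mapping instead of every service.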
The ODS stores information in the format of the logical data model, which is well documented and shared across the IT organization. Business services that need access to information, such as customer data, can access and update the data in the ODS and thus are insulated from the complexities of the underlying systems. The ODS is in effect a single, logically aggregated view of the underlying data. This is similar in many respects to the insights provided by a data warehouse. The fundamental difference is that an ODS is part of the operational landscape and holds a combination of real-time and latent data. It can be leveraged for ad hoc decision making as well as traditional daily or weekly reports. Because it supports ad hoc decision making, it is often dubbed a real-time data warehouse. Often an organization's data warehouse will be one of the data sources that feeds the ODS, so that operational data can be enhanced with contextual data for deeper insights.

The metadata discussed in the previous section, including data location, format, relationships, transformation logic, cleansing rules, and cross-reference relationships, will drive the creation of the data integration services required to build the ODS and keep it in sync with the underlying systems. Different data components of the ODS will have different latency requirements, as dictated by business rules. Inventory information may have to be real-time because it changes often, but a customer address may only need to be refreshed once a day, and customer order history may only need to be refreshed on demand. The data services platform must understand not only the semantic and structural relationships of the underlying data but also the latency requirements and synchronization patterns necessary to keep the ODS in sync with the underlying systems. The data integration services must be able to deliver on these latency service level agreements between consuming applications and the ODS in the most efficient and optimized manner possible. They must also do so with a certain amount of flexibility, as what is batch or near real-time today may need to become real-time tomorrow as business needs evolve. Finally, this entire environment must be secured, monitored, and managed. The underlying data integration processes must offer high reliability, availability, and scalability to support the mission criticality of the services that rely on them.

Figure 1. Data integration in an SOA: business-level services built with TIBCO BusinessWorks, Java EE, .Net, etc. sit on top of a data integration layer (application integration, data integration, cross-referencing, and metadata management) whose data integration services synchronize, in real time or batch, with the ODS and data warehouse over the message bus.
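The per-entity latency rules described above can be sketched as a small policy table. The entity names and refresh intervals below simply mirror the examples in the text and are assumptions, not a TIBCO mechanism:

```java
import java.time.Duration;
import java.util.Map;

// Hypothetical per-entity refresh policy for an ODS; entities and
// intervals mirror the examples in the text and are illustrative.
public class OdsRefreshPolicy {
    public enum Mode { REAL_TIME, SCHEDULED, ON_DEMAND }
    public record Policy(Mode mode, Duration interval) {}

    static final Map<String, Policy> POLICIES = Map.of(
        "Inventory",       new Policy(Mode.REAL_TIME, Duration.ZERO),       // push every change
        "CustomerAddress", new Policy(Mode.SCHEDULED, Duration.ofDays(1)),  // refresh once a day
        "OrderHistory",    new Policy(Mode.ON_DEMAND, Duration.ZERO));      // consumer pulls explicitly

    // Decide whether a cached ODS entry of the given age must be refreshed
    public static boolean needsRefresh(String entity, Duration age) {
        Policy p = POLICIES.getOrDefault(entity, new Policy(Mode.ON_DEMAND, Duration.ZERO));
        return switch (p.mode()) {
            case REAL_TIME -> true;
            case SCHEDULED -> age.compareTo(p.interval()) >= 0;
            case ON_DEMAND -> false;
        };
    }
}
```

Keeping the policy in data like this, rather than hard-coding it per service, is what gives the flexibility the text asks for: promoting an entity from daily batch to real time is a one-line change.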
Towards the Agile Enterprise

Building a strong data integration layer as the infrastructure for an SOA will help companies reap benefits in both the short and long term. With the explosion of data volumes and system complexities within organizations, this can be a daunting task and hence is often swept under the rug. However, when done right it leads to technical benefits that in turn translate directly into business benefits that a CEO, CFO, or line-of-business leader can appreciate. A data integration layer both enhances the existing benefits of an SOA and leads to new benefits as well.

Technical Benefits

• Leverage existing data assets – An SOA unlocks existing data trapped within siloed applications and data stores, thereby extending the life of existing IT investments and reducing the need for costly new investments every time new functionality is required.

• Lower development and maintenance costs – Through the reuse of existing assets, such as data integration services and metadata, developers can greatly increase their productivity when deploying new functionality. Furthermore, developers can build upon and interoperate with other services if each adheres to the established common semantic data model. The result is that IT can substantially reduce the risk and costs associated with developing new functionality, as well as the costs that go into the upkeep and maintenance of existing functionality.

• Flexibility to change – An SOA requires consumers and producers of services to be loosely coupled, such that they may not even have knowledge of each other. In addition, a data integration layer with metadata-driven data integration processes allows for loose coupling between services and the data they depend on. This handshake between data and functionality allows business logic to be rapidly modified as business needs dictate, making the IT organization much more adaptive and flexible to change.
Business Benefits

• Single version of the truth – A strong data integration layer built on top of a common data model allows data spread across the organization to be accessed, correlated, and aggregated in a consistent manner. Decision makers have access to a single unified view of the information that they can act upon, rather than multiple pieces of uncorrelated data, each of which tells a different part of the story.

• Time to market – An SOA provides the infrastructure for reuse and productivity so that new functionality can be deployed quickly. At the business level this translates into an increased ability to leverage market opportunities by offering customers new products and services in increasingly shortened timeframes.

• Better, faster decision making – The data integration infrastructure, when deployed in a manner that supports the principles of an SOA, allows companies to access, share, and understand information throughout the organization faster than before, and in a manner that allows decision makers not only to view information but to act on it as well.

• Lower costs – The benefits around reuse, productivity, interoperability, and faster development times delivered by an SOA translate into lower IT costs to support existing business processes and develop new functionality. This results in direct bottom-line savings for the company.
Why TIBCO

TIBCO provides a comprehensive solution for building the data integration infrastructure critical to SOA initiatives. All components of the solution are provided through a single integrated platform that allows IT organizations to leverage common design elements, adapters, metadata models, event triggers, administration policies, and best practices for a flexible, low-cost solution. Each individual product can work independently within the customer environment or in a tightly integrated fashion with other TIBCO products. Key capabilities required for this solution and provided by the TIBCO platform are:

• Enterprise metadata management to manage all metadata assets, including the logical data model as well as the location, transformation logic, and cleansing rules that map the common data model to the actual system architecture.

• Cross-referencing engine to ensure that the same data, identified in different ways within different applications, is viewed and aggregated as a single entity. For example, a single customer may be identified by one ID in a CRM system and by another ID in the order fulfillment system.

• Transactional integration to manage transactional synchronization between systems.

• Data integration capabilities to manage the integration of large volumes of data in batch, real time, or any combination in between.

• Change data capture to detect and move only changes to the data between systems in real time, rather than moving the entire data set, greatly reducing the need for large overnight batch windows.

• Support for open standards across the board to reduce the risk of vendor lock-in and reduce the cost of ownership for the customer by reusing existing in-house skill sets and resources.

• Ubiquitous connectivity to the underlying heterogeneous environment so that data and application logic stored within individual systems can be unlocked, securely accessed, and reused across the enterprise.
• Enterprise management and monitoring of data and application integration services for 24x7 mission-critical reliability and availability.

• Security, including payload encryption as well as role-based access and authorization.

• Architecture best practices and guidelines based on having successfully deployed SOA initiatives at thousands of customers worldwide.
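The cross-referencing capability in the list above can be illustrated with a minimal sketch: each system-local identifier maps to one canonical ID, so the CRM and order fulfillment records for the same customer resolve to the same entity. Class and method names here are invented for the example, not a TIBCO API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Minimal cross-reference index: (system, local ID) -> canonical ID.
// Invented for illustration; not a TIBCO API.
public class CrossReference {
    private final Map<String, String> index = new HashMap<>();

    // Record that (system, localId) refers to the canonical entity
    public void link(String system, String localId, String canonicalId) {
        index.put(system + ":" + localId, canonicalId);
    }

    // Resolve a system-local ID to the shared canonical identity, if known
    public Optional<String> resolve(String system, String localId) {
        return Optional.ofNullable(index.get(system + ":" + localId));
    }
}
```

With such an index in place, aggregation logic can group records from a CRM system and an order fulfillment system under a single customer entity even though each system uses its own key.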
Figure 2. TIBCO's standards-based platform for delivering event-driven services in an SOA: the TIBCO business integration platform (adapters, metadata management, management and monitoring) provides enterprise integration services — partner, data, application, and mainframe integration — exposing services such as delivery status, competitive bid, customer outreach, available inventory, new order, and order history.

TIBCO aims to help organizations meet business objectives by giving key decision makers the ability to obtain and use accurate, up-to-date information when they need it, so they can identify the course of action with the best possible outcome in terms of the organization's defined measures of success (revenue, expense, customer satisfaction, etc.), and by allowing the organization to change rapidly as business dynamics dictate. TIBCO has been in the business of integration for almost two decades. Innovation, thought leadership, and knowledge of the market and customer requirements drive research and development efforts for TIBCO products. With over 2,000 customers, TIBCO is the largest independent integration software company.

Global Headquarters: 3303 Hillview Avenue, Palo Alto, CA 94304
Tel: +1 650-846-1000 | Toll Free: 1 800-420-8450 | Fax: +1 650-846-1005

©2005, TIBCO Software Inc. All rights reserved. TIBCO, the TIBCO logo, The Power of Now, TIBCO BusinessWorks, and TIBCO Software are trademarks or registered trademarks of TIBCO Software Inc. in the United States and/or other countries. All other product and company names and marks mentioned in this document are the property of their respective owners and are mentioned for identification purposes only. 10/05