Data Federation


This briefing provides a high-level overview of the implications and efficacy of data federation.

Published in: Business, Technology

    1. 1. Semantech Inc. 2008 Lecture Series: Federated Data Architecture (Logical Data Integration). Presented by Stephen Lahanas, Principal Consultant, Semantech Inc., Feb 14th, 2008. Copyright 2008, Semantech Inc. – All Rights Reserved
    2. 2. Presentation Objectives
        • To define Federated Data Architecture.
        • To highlight current best-practice solutions that match this architectural approach.
        • To illustrate the concepts in the context of a real-world case study.
        • To illustrate the concepts in the context of federal and commercial sector IT modernization efforts.
    3. 4. Characteristics of Data Federation
        • It is a design philosophy.
        • It is a data architecture ‘pattern.’
        • It is an integration approach.
        • It can be merged into a lifecycle methodology.
        • It is user-centric.
        • It is flexible.
        • It is rapid, in both development and system response.
        • It is focused on capability rather than technical orthodoxy.
        • It is designed for performance and pragmatism.
    4. 5. A Definition of Data Federation
        • Data Federation represents a pragmatic solution for ‘loosely coupled’ enterprise integration and, perhaps more importantly, for enterprise and multi-enterprise interoperability. Federated data architectures support the exploitation of multiple disparate authoritative data sources within the context of a logically integrated or orchestrated view of the enterprise. Federated architectures are by definition data fusion solutions.
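The definition above can be illustrated with a minimal sketch, assuming two invented sources: a relational ledger and a list of payroll records. Every name here (`ledger`, `PAYROLL_RECORDS`, `FederatedView`, the column names) is hypothetical and not taken from the briefing. The point is only that each authoritative source keeps its own shape while small adapters map it into one logical view.

```python
import sqlite3
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]

def make_ledger_db() -> sqlite3.Connection:
    """Hypothetical source A: a relational system with its own column names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ledger (org_code TEXT, amt_usd REAL)")
    conn.executemany("INSERT INTO ledger VALUES (?, ?)",
                     [("A1", 100.0), ("B2", 250.0)])
    return conn

# Hypothetical source B: records exposed by another system in a different shape.
PAYROLL_RECORDS = [
    {"unit": "A1", "cost": 40.0},
    {"unit": "C3", "cost": 75.0},
]

def ledger_adapter(conn: sqlite3.Connection) -> Iterator[Row]:
    """Map ledger rows into the shared logical schema."""
    for org, amt in conn.execute("SELECT org_code, amt_usd FROM ledger"):
        yield {"organization": org, "amount": amt, "source": "ledger"}

def payroll_adapter(records: List[dict]) -> Iterator[Row]:
    """Map payroll records into the same logical schema."""
    for rec in records:
        yield {"organization": rec["unit"], "amount": rec["cost"], "source": "payroll"}

class FederatedView:
    """Logically integrated view; the sources themselves are left untouched."""

    def __init__(self, adapters: List[Callable[[], Iterator[Row]]]):
        self._adapters = adapters

    def rows(self) -> Iterator[Row]:
        for adapter in self._adapters:
            yield from adapter()

    def total_by_organization(self) -> Dict[str, float]:
        totals: Dict[str, float] = {}
        for row in self.rows():
            org = row["organization"]
            totals[org] = totals.get(org, 0.0) + row["amount"]
        return totals

conn = make_ledger_db()
view = FederatedView([
    lambda: ledger_adapter(conn),
    lambda: payroll_adapter(PAYROLL_RECORDS),
])
totals = view.total_by_organization()
print(totals)
```

Adding a third source in this sketch means writing one more adapter; neither the existing sources nor the consumers of the view need to change, which is the loose coupling the definition describes.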
    5. 6. Best Practices & Architectural Principles
        • User Involvement & Ontology Automation
        • Metadata Orchestration Layer
        • Data Performance Engineering
        • Identity Federation & Multi-level Security
        • SOA Framework Exploitation (Enterprise Service Bus)
        • Common Messaging Exchange (Format) Exploitation
        • Data Governance Framework (patterned upon federated coordination of authoritative systems rather than the configuration of a single-source data repository)
    6. 8. “Federated data orchestration is already a best practice on the Internet, in nature and in the management of other knowledge resources. The exponential proliferation of data over the next two decades will make single source / warehouse solutions even less practical.”
    7. 9. How does this approach impact the enterprise?
        • It reduces the risk involved in “Big Bang” data warehouse / data mart implementations.
        • It allows for more pragmatic transitions of legacy capability to modernized solutions.
        • It helps to maintain vital knowledge capital associated with the system and data experts. The extra time this buys can then be used to ensure that follow-on systems, services or consolidations move forward without losing corporate knowledge.
        • It allows for more flexibility in the face of complex integration / interoperability scenarios or budgetary constraints.
        • It helps to maintain a close connection to the end users of current data systems.
    8. 11. As with any new approach, there is some confusion…
        • “Data Federation doesn’t support enterprise data standardization.” Yes and no. In most cases, large-scale, strict data governance solutions have failed. Why? Because they lacked flexibility, speed and user input. Data federation gives us the chance to move quickly while also beginning the holistic definition process with end-user feedback.
        • “Data Federation doesn’t lend itself to optimal DBMS performance (query response).” Not true. Logical integrations of federated sources perform far better than their data warehouse counterparts. Why? The optimization is ABSTRACTED from the source data. This flexibility adds value to Authoritative Data Sources (ADSs) without redesigning every system. The design occurs once, in one place, in a metadata-driven optimization layer.
    9. 12. Case Study Part 1 (USAF Financial System)
        Subject: A global enterprise comprising multiple semi-autonomous divisions: the United States Air Force.
        Problem: Manage an operating budget of nearly $40 billion affecting the programs, administration and combat operations of personnel and facilities worldwide. Not only is the amount staggering, but these dollars support more than 400,000 civilian, military and contractor employees working around the clock in more than 100 global locations. The need was immediate: capability was expected within a year.
    10. 13. Case Study Part 2: The Architectural Challenge
        • The U.S. Air Force needed to integrate financial data from more than 20 disparate systems and provide geographically dispersed financial managers with the tools to manage day-to-day operations.
            • Senior leaders at all levels required a real-time snapshot.
            • The financial data management system had to scale from an initial few hundred users to more than 15,000 users.
        • The system had to be operational 24/7 in every time zone worldwide.
        • System users at all levels needed instant access in order to make timely decisions.
        • With limited resources, the USAF needed to keep development and deployment costs for any solution to an absolute minimum.
    11. 14. Case Study Part 3: Realization that only an Agile Architecture would work…
        • This solution would push the limits of typical data warehousing solutions.
        • Users needed ad hoc control of their queries. A few canned, stock reports would be insufficient.
        • The system’s security model required unprecedented access controls, since it dealt with highly sensitive financial data.
        • Data had to be available for analysis within minutes to users worldwide, and the system had to scale without disrupting information flow.
    12. 15. Case Study Part 4: How does this exploit Federated Data Orchestration?
        • The solution allows for the continued operation of legacy authoritative systems while they are being replaced, modernized or migrated to the AF ERP solutions (DEAMS & ECSS). These systems represent the federated source data.
        • The solution deploys a layer of federated data and metadata for optimization in a centralized location, mirroring the source environment without the performance constraints of sourcing data directly from the BI layer.
        • True federation occurs at a minimum of three tiers, as the solution interacts with domain data from across the USAF. This represents “logical integration.”
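One plausible reading of the tiers described above is: legacy sources that keep running, a centralized layer that mirrors their data alongside metadata, and a BI tier that queries only the mirror. The sketch below is a toy illustration of that reading and not the USAF implementation; all class, table and source names (`LegacySource`, `OptimizationLayer`, `obligations`, `legacy_gl`, `legacy_pay`) are assumptions made for the example.

```python
import sqlite3

class LegacySource:
    """Tier 1 (hypothetical): an authoritative system that stays in operation."""

    def __init__(self, name, rows):
        self.name = name
        self._rows = rows
        self.reads = 0  # counts how often the source itself is hit

    def extract(self):
        self.reads += 1
        return list(self._rows)

class OptimizationLayer:
    """Tier 2 (hypothetical): centralized mirror of source data plus metadata."""

    def __init__(self, sources):
        self._sources = sources
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE obligations (source TEXT, org TEXT, amount REAL)")
        # Tuning lives here, once, instead of in every source system.
        self._db.execute("CREATE INDEX idx_org ON obligations (org)")
        self.metadata = {}

    def refresh(self):
        """Re-mirror every source and record simple metadata about each load."""
        self._db.execute("DELETE FROM obligations")
        for src in self._sources:
            rows = src.extract()
            self._db.executemany(
                "INSERT INTO obligations VALUES (?, ?, ?)",
                [(src.name, org, amount) for org, amount in rows])
            self.metadata[src.name] = {"row_count": len(rows)}
        self._db.commit()

    def query(self, sql, params=()):
        return self._db.execute(sql, params).fetchall()

def bi_total_for_org(layer, org):
    """Tier 3 (hypothetical): an ad hoc BI query answered from the mirror."""
    (total,) = layer.query(
        "SELECT COALESCE(SUM(amount), 0) FROM obligations WHERE org = ?",
        (org,))[0]
    return total

sources = [
    LegacySource("legacy_gl", [("A1", 100.0), ("B2", 250.0)]),
    LegacySource("legacy_pay", [("A1", 40.0)]),
]
layer = OptimizationLayer(sources)
layer.refresh()
a1_total = bi_total_for_org(layer, "A1")
b2_total = bi_total_for_org(layer, "B2")
print(a1_total, b2_total, layer.metadata)
```

In this sketch the legacy sources are read only during `refresh()`, so any number of ad hoc queries can run without adding load to the systems being migrated.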
    13. 16. Case Study Part 5: Current Federated Architecture Solution Performance
        • Averaging more than 600,000 ad hoc queries per month
        • 3 TB of data, expected to double over the next 3 years
        • 2 billion+ rows of data
        • 15,000+ users worldwide
        • 8.29M queries in FY06, 95% of them ad hoc
        • 1.2M queries in September ’06
        • Ad hoc results: 80% in 10 seconds or less
        • 99.7% uptime, “follow the sun” globally
        • At least 3 other systems were developed in attempts to replace this solution and conform to formal data warehouse strategies. None of them met user expectations for accuracy, query capability or performance.
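The query figures above can be cross-checked against each other with a little arithmetic. This is only a consistency check on the numbers quoted on the slide, assuming FY06 spans 12 months and that the 95% ad hoc share applies to the full-year total; the variable names are illustrative.

```python
# Figures quoted on the slide.
fy06_queries = 8_290_000        # total queries in FY06
ad_hoc_share = 0.95             # share of FY06 queries that were ad hoc
september_queries = 1_200_000   # queries in September '06

# Assumption: FY06 covers 12 months.
months = 12

ad_hoc_fy06 = fy06_queries * ad_hoc_share
ad_hoc_per_month = ad_hoc_fy06 / months
all_per_month = fy06_queries / months
september_vs_average = september_queries / all_per_month

print(f"Ad hoc queries in FY06:        {ad_hoc_fy06:,.0f}")
print(f"Ad hoc queries per month:      {ad_hoc_per_month:,.0f}")
print(f"All queries per month:         {all_per_month:,.0f}")
print(f"September vs. monthly average: {september_vs_average:.2f}x")
```

Under those assumptions the monthly ad hoc average comes out at roughly 656,000, which agrees with the “more than 600,000 ad hoc queries per month” bullet, and September ’06 ran well above the average month.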
    14. 17. Separate Organizations: ‘Federated Workflows’
        • Many enterprises consist of multiple organizations providing data or services to one another in separate, uncoordinated workflows that culminate in eventual Business Intelligence reports.
        • One of the possible goals of any data architecture modernization endeavor is to illustrate and coordinate all of the workflows related to the eventual provision of enterprise-wide analytics.
        • Federated workflows can be enabled through federated but coordinated data sources.
    15. 18. By definition, any Common Operating Picture (COP) represents a fusion of disparate or federated sources. It is designed more for real-time awareness and dynamic analytics. The optimization cache can be used for historical reference as well. The key is not trying to replace everything else with this solution all at once…
    16. 19. The reality is that nearly all integration within and across SOA environments is based upon data exchange and will ultimately be demonstrated through data exploitation. The data architecture, business architecture and services must all be logically orchestrated for SOA to work as expected.
    17. 20. Conclusion
        • Federated Data Architecture is more than merely avoiding the use of traditional Data Warehouse techniques. It represents the next-generation approach to sophisticated data integration and involves a variety of tools and techniques.
        • Most importantly, though, it provides a more rapid and pragmatic way to solve highly complex enterprise data issues. Enterprises can realistically expect to deploy comprehensive solutions in months, not years…