Extract-Transform-Load (ETL) Market Overview and Directions. TDWI Webcast Series, June 13, 2007. Mark Madsen, http://ThirdNature.net
Course Outline and Overview: Components of the data integration market; Market data and trends; What’s happening now; What to expect
ETL: Extract, Transform and Load. The good points: connectivity, transformation, read-write access, metadata. The bad points: latency, distributed query limitations, complexity, batch processing focus. Diagram labels: Source Environments (Databases, Documents, Flat Files, XML, Services, ERP, Applications); ETL Engine; Database Targets.
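The three stages named on this slide can be illustrated with a minimal batch sketch. It is only an illustration under stated assumptions: the CSV content, the `orders` table, and the function names are hypothetical and not taken from the presentation, and an in-memory SQLite database stands in for the database target.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; rows and column names are invented for illustration.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3,carol,not-a-number
"""


def extract(text):
    """Extract: read raw rows from a flat-file source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, convert types, and set aside rows that fail."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append((
                int(row["order_id"]),
                row["customer"].strip().lower(),
                float(row["amount"]),
            ))
        except ValueError:
            rejected.append(row)
    return clean, rejected


def load(conn, rows):
    """Load: write the cleaned rows into the database target."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_batch(text, conn):
    """Run one batch and return (rows loaded, rows rejected)."""
    clean, rejected = transform(extract(text))
    load(conn, clean)
    return len(clean), len(rejected)


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")
    print(run_batch(SOURCE_CSV, target))
```

The whole file is processed in one pass before anything reaches the target, which is the batch focus and latency the slide lists among the bad points.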
EAI: Enterprise Application Integration. Multiple models; multiple implementations; evolution toward open standards-based services and SOA. Diagram labels: Hub, Point, Bus.
EDR: Enterprise Data Replication. Diagram labels: EDR Server; Order Entry, CRM, Fulfillment, Inventory.
EII: Enterprise Information Integration. Diagram labels: Source Environments (Databases, Documents, Flat Files, XML, Queues, ERP, Applications); EII Server; Virtual Models; SQL, SOAP, WS-*, REST, File; Consuming Environments (Databases, Dashboards, OLAP, Productivity, BAM/BPM, Reporting, ETL, ERP, Applications).
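The "virtual model" idea can be sketched as a view that is resolved against the sources at query time instead of being copied into a target. This is a toy illustration, not how any particular EII product works: the two sources, their contents, and the `customer_totals` function are hypothetical.

```python
import sqlite3

# Hypothetical source 1: a CRM database (in-memory SQLite stands in for it).
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
crm.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")])

# Hypothetical source 2: order records as they might be parsed from a flat file.
orders_file = [
    {"customer_id": 1, "total": 250.0},
    {"customer_id": 1, "total": 100.0},
    {"customer_id": 2, "total": 75.0},
]


def customer_totals():
    """Virtual model: join both sources on demand; nothing is persisted."""
    totals = {}
    for order in orders_file:
        cid = order["customer_id"]
        totals[cid] = totals.get(cid, 0.0) + order["total"]
    for cid, name in crm.execute("SELECT id, name FROM customers ORDER BY id"):
        yield {"customer": name, "total": totals.get(cid, 0.0)}


if __name__ == "__main__":
    for row in customer_totals():
        print(row)
```

Because the join runs on every request, a consumer always sees current source data, at the cost of pushing query load onto the sources.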
MDM: Processes, systems and technologies for managing master reference data and ensuring consistency across the organization.
MDM: Master Data Management. Operational / infrastructure: distribution and/or synchronization of master reference data to ensure consistency in transactions and daily operations; more short-term latency and transactional issues. Analytical / application: distribution or synchronization of master data to ensure consistent usage for BI purposes; more single-definition and long-term tracking issues. There are two basic product types with different motivations: master data registries and MDM applications.
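As a rough sketch of the first product type, a master data registry can be thought of as a cross-reference from each source system's local key to one shared master identifier. The class, method names, and sample keys below are hypothetical and only illustrate the idea; real registries also handle matching, survivorship, and governance.

```python
class MasterDataRegistry:
    """Toy registry mapping (system, local_key) pairs to a shared master ID."""

    def __init__(self):
        self._by_source = {}  # (system, local_key) -> master_id
        self._next_id = 1

    def register(self, system, local_key, same_as=None):
        """Register a source record.

        same_as is the (system, local_key) of an already registered record
        that describes the same real-world entity, if one is known.
        """
        if same_as is not None:
            master_id = self._by_source[same_as]
        else:
            master_id = self._next_id
            self._next_id += 1
        self._by_source[(system, local_key)] = master_id
        return master_id

    def master_id(self, system, local_key):
        """Look up the master ID for one source record."""
        return self._by_source[(system, local_key)]

    def cross_reference(self, master_id):
        """List every source record linked to a master ID."""
        return sorted(k for k, v in self._by_source.items() if v == master_id)


if __name__ == "__main__":
    registry = MasterDataRegistry()
    acme = registry.register("crm", "C-100")
    registry.register("order_entry", "4711", same_as=("crm", "C-100"))
    print(registry.cross_reference(acme))
```

The registry stores only the links, so the master attributes stay in the source systems; an MDM application would typically hold and manage the master records themselves.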
Data Profiling, Data Quality, Metadata. Standalone data profiling, quality and metadata tools have been abandoned by vendors as an application market; it’s all merging into data integration products.
Market data and trends: What’s driving the current market and what are the trends?
Data Integration Market Size and Growth Source: IDC
Spending Priorities in IT Great but… Sources: CIO Insight
Vendor Market Share Source: Forrester Research, Inc.
What is the Real Market Share? Sources: Forrester Research, Inc. and TDWI
Diversity of Data Sources Increasing. The number and formats of data sources are increasing, countering any gains made by ERP installations. Sources: TDWI, META Group, Inc.
Timeliness of Data Increasingly Important. Increased data load frequency; decreasing nightly load windows and more on-demand access. Chart axis: Percentage of Respondents. Sources: TDWI, Gartner.
What’s happening now: Consolidation, Extension, Coping
Commoditization of ETL Technology
Market Reflects Different Customer Types
Incremental Product Extension, i.e. new features: text, semi-structured data, documents, predictive analytics, search.
Finding Other Uses for ETL
Other Uses for ETL: one-time extracts, system migrations, system consolidations, correction / synchronization.
In IT, Data Integration is Still Messy. The history of IT has left us with both application silos and integration silos. The current state of practice in IT is to integrate the integration software; it’s worse, not better. Timeline labels: 1960s, 1970s, 1980s, 1990s, 2000s.
Integration Competency Centers. Larger organizations are dealing with integration complexity by creating ICCs: centralized groups that address integration across systems and projects, rather than dealing with integration project by project in an ad hoc fashion. There is a split in the organization depending on where the ICC starts from.
What we expect to happen
Shift in Data Integration Focus. Features being built into products indicate a shift in focus from data and technical features to process and the data management lifecycle: traceability, data quality, data governance, master data management.
Multiple Integration Technologies in Suites. Diagram labels: DB API, JDBC/ODBC, Files, Queues, JMS, SOAP/REST, JSR 170; Databases, Documents, Flat Files, XML, Queues, ERP, Legacy Apps; Data Quality, Data Profiling, Metadata Services, EDR, EII, Adapters / Connectors, ETL.
ETL Product Evolution. ETL tools have been growing into suites, which are slowly evolving into data integration platforms. The current state of the art in the ETL market is suites with integrated ETL, metadata, data quality and profiling. The tools are adding new sources: mining output, federation, services, EAI, semi-structured data. Diagram labels: batch files, ftp, database, EAI, ETL, SOA.
ETL Vendor Positioning and Strategy. ETL vendor strategies in the data integration market have been shifting. Horizontal: expand to fill all the different types of integration needs, staying within the information management layers. Vertical: leverage strengths to expand up and down into other layers. Niche: focus on specific technical or vertical market needs for a single technology.
Warehouse Architecture: Traditional View. Diagram labels: Source Environments (Databases, Documents, Flat Files, XML, Queues, ERP, Applications); SQL Warehouse Database, ETL, ODS, Mart; Data Warehouse Clients (Dashboards, OLAP, Productivity, BAM/BPM, Reporting, DM, Data Mining).
Warehouse Architecture Going Forward. Diagram labels: Source Environments (Databases, Documents, Flat Files, XML, Queues, ERP, Applications); Data Consumers (Databases, Dashboards, OLAP, Productivity, BAM/BPM, Reporting, ETL, Data Mining, Applications); SQL Warehouse Database, ETL, two components marked "?", Mart, ODS, EDR, EII, Content Store.
Expect that… The big vendors will continue to get bigger. Commoditization will continue, forcing price disruption and more consolidation. There will still be new entrants, particularly in low-cost or specialty areas such as specific apps or streaming data. Performance will continue to be a concern, but there will be many more options to deal with it. Suites will accrete more stuff, and the split between focused products and platform/stack products will broaden. Some things won’t catch on like we think.
Creative Commons This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/us/ or send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.
Creative Commons Image Attributions The following CC licensed images were used in this presentation: Shopping carts:  http://flickr.com/photo_zoom.gne?id=238070241&size=o
