Usage Landscape of Enterprise Open Source Data Integration


Published in: Technology

  1. WHITE PAPER. Usage Landscape: Enterprise Open Source Data Integration
  2. Table of Contents
     Introduction: page 3
     Background: page 3
     Diverse Data Integration Projects: page 4
     Data Integration Needs and Tools: page 6
     Open Source Data Integration vs. Proprietary Solutions: page 8
     Enterprise Requirements: page 9
     Community Support: page 10
     Community Involvement: page 11
     Conclusion: page 13
  3. Talend White Paper: Usage Landscape - Enterprise Open Source Data Integration

     Introduction

     Enterprise data integration needs are growing rapidly, as is interest in open source technologies and the adoption of open source solutions. With this in mind, Talend conducted a survey to define the usage landscape of open source data integration and to profile users of this technology.

     The data used in this analysis was collected from 1013 survey participants. Responses came primarily from the U.S. (56.5%), followed by Europe (35.2%), with the remaining 8.3% originating in the rest of the world.

     [Figure: Survey respondents’ demographics (US 57%, Europe 35%, Other 8%)]

     Background

     As companies merge, acquire new applications, and build their IT platforms by incorporating disparate applications with legacy systems, information systems are becoming more and more heterogeneous. As a result, data integration tools are now indispensable if enterprise IT departments are to properly manage the flows of data across the information system.
  4. In addition, alternative models of software deployment, such as Software as a Service (SaaS), and the need for interoperability with partners, customers, providers, etc., all have an important impact on data integration requirements.

     The global economy is imposing cost controls on IT managers, both in terms of staff and software, at a time when data integration represents an increasingly large percentage of the enterprise IT budget. Asked to do more with less, IT personnel would be better off spending cycles on tasks other than the time-consuming manual scripting needed to meet custom requirements. In fact, software resources with lower acquisition and operation costs would allow IT managers to more easily deploy enterprise-grade solutions.

     [Sidebar] Data Integration: The process of combining data residing at different sources and providing the user with a unified view of these data.

     In this context, open source solutions offer a very compelling argument. Open source tools can automate and maintain tasks formerly requiring manual scripts, and the existing skills of the IT implementation team easily transfer to an open source offering. In addition, IT departments don’t have to justify significant up-front fees.

     Diverse Data Integration Projects

     Data integration is the collective term for technologies that include ETL (Extract-Transform-Load) for business intelligence and data warehousing, and operational data integration (the flows of data across operational applications and systems). These needs can range from high-throughput batch transfers of data to near-real-time, trickle-feed data flows.

     Project Type

     Consistent with the global data integration market distribution, whether open source or proprietary, most of the survey participants (61.5%) use open source solutions for their ETL projects, in
  5. particular for BI, data warehousing and analytics. This can be attributed to the fact that ETL is the most mature segment of the entire data integration market.

     [Figure: Types of projects for which open source data integration is used. Bars: ETL; Data Loading; Operational Data Integration: Batch; Migration; Operational Data Integration: Real Time; Database Synchronization. Axis: 0% to 70%]

     Data loading (41.9%) and data migration (26.5%) are the second and fourth most popular types of project. Both of these are good candidates for open source solutions, as they are typically one-offs, with no ongoing purpose that would justify a long-term investment in an expensive proprietary tool.

     Data synchronization (19.1%) is also a popular type of project conducted by open source data integration users.

     [Sidebar] Data Loading: The process of loading data in an application or database, for example prior to its deployment.

     [Sidebar] Data Migration: The process of transferring data between databases, applications or other systems, with the purpose of replacing one system with another.

     [Sidebar] Data Synchronization: The process of establishing data consistency on remote sources, continually harmonizing the data over time.

     Batch vs. Real-Time

     Operational data integration, whether batch or real-time, is also a good fit for open source solutions. As business tempos speed up, real-time and near-real-time operational data integration projects are expected to prevail over bulk transfer projects. As of the date of the survey, 40% of participants used open source tools to manage their batch operational data integration tasks, compared to only 22.9% for real-time projects, but the latter is a much faster growing segment.
  6. ETL vs. Operational Data Integration

     Taken together, batch and real-time operational data integration projects (40% plus 22.9%, or 62.9%) are slightly better represented than ETL (61.5%), even though the former market segment is less mature. And, if we also add in data synchronization (19.1%), the operational project share reaches 82%.

     The reason for this over-representation is simply that open source tools are particularly appropriate for operational projects because they meet a number of data integration requirements, whereas proprietary tools traditionally focus on ETL. In addition, enterprises that want to diversify their data integration tools are often discouraged by the licensing costs of proprietary applications. Open source solutions offer a greater breadth of connectivity and more flexibility in terms of adoption, deployment, and maintenance.

     Data Integration Needs and Tools

     Although software companies are trying to provide unified integration solution packages, the data integration needs of most enterprises are so complex that they often need to multiply the number and nature of the integration software products they use.

     [Figure: Data integration technologies used in conjunction with open source. Bars: Manual scripting; Database utilities; Commercial software. Axis: 0% to 60%]
  7. Survey participants reported using a combination of commercial applications, open source solutions, and database utilities to meet their data integration needs.

     The statistics show that using open source and commercial solutions in combination is common (31.2%), and that the two can, and do, coexist on the same platform. In fact, open source solutions are often complementary to an existing proprietary solution that, for whatever reason, cannot address a specific need. In some cases it may simply not be worth the expense of investing in an extension to the proprietary solution.

     The high incidence of database utilities shown in the survey results (53.9%) is as expected: these utilities are a no-cost solution and are usually included with the databases. Their usefulness, however, is limited to dedicated database usage.

     Applications are often stacked as needs arise, increasing connectivity issues, whether enterprises want their CRM system to communicate with their ERP module or want their disparate databases to exchange information with their home-grown platform. Faced with multiple connectivity issues, enterprises often have no option other than manual scripting to keep data flowing across their heterogeneous enterprise systems. This is why the survey results rank manual scripting as one of the technologies most frequently used (54.7%) by enterprises to meet their integration needs. Although this is much higher than the share for commercial packaged technologies (31.2%), it is not surprising that manual scripting is the solution of choice, as it carries the lowest initial cost.

     Although manual scripting is often intended to be a short-term fix for interchange issues, once in production it often becomes a permanent solution. And, in the end, this simple stop-gap can
  8. become an entire home-grown platform. The drawback of hand coding or home-grown platforms surfaces over time in the inevitable maintenance problems that increase the TCO. The advantage, however, is that it fits a particular need that none of the available commercial or open source solutions can meet.

     Open Source Data Integration vs. Proprietary Solutions

     In an ongoing effort to lower their data integration software TCO, many enterprises are now considering open source solutions, not just for one-time projects, but also for their ongoing mission-critical processes, to replace or complement their expensive CPU-dependent solutions.

     [Figure: Decision criteria. Rows: Ease of use; Performance; Avoid lock-in; No licensing costs; Source code access. Ratings: Very important, Important, Neutral, Not important. Axis: 0% to 100%]

     Open source solutions are a real alternative to the proprietary world. Key players have made major strides toward improving the usability and friendliness of open source technologies, traditionally a weak spot for these applications. In just a few short years, open source has evolved from something “geeky” into an enterprise-ready solution. Today, open source solutions are sufficiently feature-rich to meet complex user requirements. The survey results reflect these expectations.
  9. Respondents rated ease of use (59%) and performance (53.9%) as the most important aspects of an open source data integration solution.

     Surprisingly, licensing cost is not the gating criterion for enterprises turning to open source solutions. It actually comes fourth, after ease of use, performance, and no lock-in (42.5%), with only 42.1% of respondents considering it very important.

     Access to the source code comes last on most priority lists when enterprises are choosing open source tools. It is a common misconception that control of the source code is important for users of open source software. Most users today understand that open source solutions are as mature as their proprietary counterparts and, therefore, don’t feel the need to enhance the code themselves.

     Today, open source solutions are advantageously replacing the source code escrow of proprietary software. However, few enterprises want to allocate in-house resources (or even have the expertise) to edit, enhance, and maintain their data integration application code.

     Enterprise Requirements

     An analysis of the survey data indicates that users expect the same performance and enterprise-scale features from open source solutions that they previously found only in proprietary products. In order of importance these features include:
     • centralized scheduling and execution dashboard
     • shared repository
     • administration tools
  10. [Figure: Enterprise open source data integration requirements. Bars: Scheduling tool; Dashboard; Shared repository; Administration tool. Axis: 0% to 70%]

     First, 60.5% of respondents want a scheduling tool that lets them consolidate and centralize their technical processes. Second, 57.8% of users need a dashboard to centrally monitor processes as they execute. Because enterprise users often work in teams and need to share data on large-scale projects, 54.9% consider a shared repository essential. Finally, 38.4% of enterprise users want an administration tool to centrally manage users and projects.

     However, not all companies have enterprise-scale requirements. Single users and SMBs might not need these enterprise-grade features. What emerges is that open source solutions address diverse needs for a variety of user profiles, whether large or small.

     Community Support

     As shown, enterprises want the same support with open source solutions that commercial applications provide. The major difference lies in the fact that a large majority of open source users (84.9%) would rather call on the community for help addressing issues than get support from a dedicated service. This lets them reduce the cost of support and decrease their data integration budget; the return they get from the community is
  11. comparable in quality to traditional support from a proprietary vendor.

     [Figure: Community vs. commercial support expectations. Bars: Community support (forums, etc.); Email-based or Web-based support; Guaranteed response times; Phone support. Axis: 0% to 100%]

     Open source users value the forum and the other community tools at their disposal, as well as the peace of mind that comes from knowing that there is no pressure to upgrade or to buy new tools. The community also tends to be more responsive than traditional support services, and community tools come at no cost to the enterprise.

     However, enterprise users working on mission-critical projects do need (and demand) vendor-provided, enterprise-grade technical support. This group still represents a minority of the total number of users of open source data integration (20.9%), but it is a fast-growing proportion.

     Community Involvement

     Two-thirds of the respondents say that they are willing to actively participate in the community, and nearly half are ready to help beta-test open source products. Open source communities have a real, live QA lab of thousands at their disposal.

     Open source users appreciate getting support from the community and feel at ease sharing their experiences and helping other users solve problems. Getting involved in the community ensures the sustainability of the
  12. open source arena and, by extension, the sustainability and the quality of the application they use.

     [Figure: Expectation for community contributions. Bars: Forum; Beta testing; Code contributions. Axis: 0% to 80%]

     Other community tools, like bug/feature tracking systems, are also broadly used by the community, especially for feature requests. Because the development cycle of open source applications is usually quite short, users know that the chances of getting a feature request developed and made available in the next release of an open source application are significantly greater than for a similar request in the proprietary domain.

     It’s a win-win situation. Community enterprises are asked to beta-test and report bugs on features that they requested previously, ensuring both quick access to these features and the quality of the developed application.

     In addition, participating in the community is much less time-consuming than getting involved in the development itself. Only 10.4% of users want to contribute to code development. A closer look at this group indicates that most of them want to contribute external features, such as connectors, rather than core code.
  13. Conclusion

     The results of the survey clearly show that open source data integration solutions are mature enough for mission-critical enterprise use in every arena and that, in most areas, open source is as powerful as its proprietary counterparts. Open source products are stable and continually evolving to meet market requirements. Their total cost of ownership is significantly lower than that of proprietary solutions, and users confirm the ease of use and performance of these products. Open source data integration is indeed enterprise ready.

     © 2009 Talend. All rights reserved.