ETL with WSO2 Enterprise Middleware Platform


Published in: Technology

  1. 1. ETL with WSO2 Enterprise Middleware Platform Prabath Abeysekara - Associate Technical Lead
  2. 2. Outline ● A Classic Use Case ● What’s ETL and How Is It Interpreted In The Modern World? ● Why ETL? ● Challenges In Implementing ETL Solutions ● Why Are Traditional Standalone ETL Products Considered Dead In The Modern World? ● What Factors Should Be Considered When Implementing ETL In Re-Architecting A System?
  3. 3. Outline (contd.) ● Impact Of Tooling ● Reference Architecture ○ How to build an “efficient, robust, scalable, auditable, performant and maintainable” ETL solution with WSO2 EMP? ● Demo - Data Mapping With WSO2 Developer Studio ● Summary ● Q&A
  4. 4. A Classic Use Case - Financial Sector [Diagram: Flat files, RDBMS, XML and Web Services feed an ETL Process that populates an Enterprise Data Warehouse, which serves Financial Reporting, Revenue Predictions, and other Analytics & BI fronts]
  5. 5. What’s ETL? - Traditional Interpretation ● Extract ● Transform ● Load
  6. 6. What’s ETL? - Modern Interpretation ● Extract ● Monitor ● Profile/Audit ● Analyze ● Cleanse ● Transform ● Load
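
The stages listed on slide 6 can be read as functions chained over a batch of records. The following is a minimal sketch only: the record fields, the stage functions, and `run_pipeline` are hypothetical names chosen for illustration and are not part of any WSO2 product. Only a subset of the stages (extract, cleanse, transform, load, plus a row-count audit trail) is shown.

```python
def extract():
    # Hypothetical source rows; a real job would read flat files, an RDBMS, or a web service.
    return [
        {"account": " A-1 ", "amount": "100.50"},
        {"account": "A-2", "amount": "not-a-number"},
    ]

def cleanse(records):
    """Drop rows whose amount is not numeric and trim stray whitespace."""
    cleaned = []
    for record in records:
        try:
            float(record["amount"])
        except ValueError:
            continue  # reject the row; a real pipeline would also report it
        cleaned.append({"account": record["account"].strip(), "amount": record["amount"]})
    return cleaned

def transform(records):
    """Map to the target shape: amounts become integer cents."""
    return [
        {"account_id": r["account"], "amount_cents": round(float(r["amount"]) * 100)}
        for r in records
    ]

def make_loader(target):
    """Return a load stage that appends records to the given target list."""
    def load(records):
        target.extend(records)
        return records
    return load

def run_pipeline(records, stages):
    """Run stages in order, recording row counts per stage as a simple audit trail."""
    audit_trail = []
    for stage in stages:
        rows_in = len(records)
        records = stage(records)
        audit_trail.append((stage.__name__, rows_in, len(records)))
    return records, audit_trail
```

Calling `run_pipeline(extract(), [cleanse, transform, make_loader(warehouse)])` with an empty list `warehouse` returns the loaded rows together with per-stage row counts, which is the kind of information the monitoring and auditing stages would consume.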
  7. 7. Why ETL? ● Generally, to build and maintain data repositories holding a “single version of the truth” drawn from the multiple heterogeneous data sources scattered across an organization or a business domain. ● Business users can then use that data for: ○ Predictive analysis ○ Revenue predictions and comparisons ○ Monitoring the overall growth of an organization ○ Business policies ○ Strategic decisions
  8. 8. Challenges ● Data definition establishment ● Need for expert knowledge ● Scalability and performance ● Business user acceptance and seamless support for a wide range of business use cases ● Maintenance, data archival ● Real-time or near real-time data synchronization
  9. 9. Why Standalone ETL Products Are Dead? ● Modern-day organizations are evolving faster than ever before. ● The tendency to adopt architecture patterns such as SOA, to reduce IT costs and gain flexible business processes, is rapidly increasing. ● Organizations are increasingly focused on “connected businesses”. ● Thus, it is very likely that an organization already has an IT infrastructure in place.
  10. 10. Why Standalone ETL Products Are Dead? ● Adopting a standalone ETL product? Possible, but worthwhile? ● Generally less support for open standards. Extension points? Connectors? More custom code! ● Usually relies on proprietary data integration patterns, inducing high maintenance costs. ● Additional licensing costs and the need for separate expert/operational assistance, again inducing high maintenance costs.
  11. 11. Why Standalone ETL Products Are Dead? ● Tendency to use in-house reusable business components, leveraging the benefits of SOA ● Lower operational costs ● Scalability is a main focus nowadays. ● Having such a process implemented enables horizontal scalability at different layers as the need arises.
  12. 12. Re-Architecting A System’s DIL? ● Data integration is always cumbersome ● Need to ensure policy compliance of data at its target containers (usually Enterprise Data Warehouses, central MDM repositories, etc.) ● Flexibility ● Ensuring acceptable performance ● What about reliability?
  13. 13. Re-Architecting A System’s DIL? ● How to deal with the freshness of data? ● When to synchronize? ● Need for tuning the system to meet various SLAs
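
One common way to approach the freshness and "when to synchronize" questions on slide 13 is incremental synchronization against a watermark: each run picks up only rows changed since the previous run. The slides do not prescribe this technique; the sketch below assumes a hypothetical `updated_at` timestamp on every source row.

```python
from datetime import datetime

def select_changed(rows, last_sync):
    """Return the rows modified after last_sync and the new watermark.

    Assumes every row carries an 'updated_at' datetime (a hypothetical column).
    """
    changed = [row for row in rows if row["updated_at"] > last_sync]
    # If nothing changed, keep the old watermark so the next run starts from the same point.
    new_watermark = max((row["updated_at"] for row in changed), default=last_sync)
    return changed, new_watermark
```

How often such a job runs is the knob that trades data freshness against load on the source systems, which is where SLA tuning comes in.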
  14. 14. Impact Of Tooling ● Scripts ● XSLT ● Custom Code
  15. 15. Impact Of Tooling ● Numerous ETL solutions fail because of a lack of tooling. ● Developers/solution composers are left with manual coding of XSLT, custom mappers, etc. ● Not scalable! ● Often requires a powerful, flexible tooling platform, particularly as the system grows and matures.
  16. 16. Reference Architecture
  17. 17. Reference Architecture - Big Picture [Diagram components: BAM, ESB, MB, DSS, DS, Enterprise DW]
  18. 18. Reference Architecture - Reliable Extraction [Diagram components: ESB, MB, DSS, Scheduled Tasks, DS]
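
Slide 18 lists only the components involved, so the flow below is an assumption: a scheduled task extracts rows and publishes them to a broker, and a consumer retries failed deliveries before giving up. The sketch uses Python's standard `queue.Queue` as a stand-in for the message broker; `extract_batch`, `consume`, and `max_attempts` are hypothetical names, not WSO2 APIs.

```python
import queue

def extract_batch(source_rows, broker):
    """Body of a scheduled task: publish each extracted row with an attempt counter of 0."""
    for row in source_rows:
        broker.put((0, row))

def consume(broker, handler, max_attempts=3):
    """Hand each message to handler; re-queue on failure, dead-letter after max_attempts."""
    dead_letters = []
    while True:
        try:
            attempt, row = broker.get_nowait()
        except queue.Empty:
            break  # nothing left to deliver
        try:
            handler(row)
        except Exception:
            if attempt + 1 < max_attempts:
                broker.put((attempt + 1, row))  # retry later
            else:
                dead_letters.append(row)  # give up and keep the row for inspection
    return dead_letters
```

The point of the broker in between is that a transient failure downstream does not lose extracted data: the message stays queued until it is processed or explicitly set aside.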
  19. 19. Reference Architecture - Validate & Transform [Diagram: within the ESB, WSO2 Data Mapper maps an input data model (Data Model X) to an output data model (Data Model Y)]
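
The idea of mapping an input data model to an output data model, with validation along the way, can be illustrated with a declarative field map. This is a plain Python sketch and not how WSO2 Data Mapper or Smooks is configured; the field names and `map_record` are hypothetical.

```python
# Target field -> (source field, converter). All names are illustrative only.
FIELD_MAP = {
    "customer_id": ("custId", int),
    "full_name": ("name", str.strip),
    "balance": ("bal", float),
}

def map_record(source, field_map=FIELD_MAP):
    """Validate that all required input fields exist, then build the output record."""
    missing = [src for src, _ in field_map.values() if src not in source]
    if missing:
        raise ValueError(f"missing input fields: {missing}")
    return {
        target: convert(source[src])
        for target, (src, convert) in field_map.items()
    }
```

Keeping the mapping as data rather than hand-written code is what makes graphical mapping tools possible: the tool edits the map, and a generic engine applies it.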
  20. 20. Reference Architecture - Auditing [Diagram components: ESB, BAM, Data Policy Compliance Reports/Dashboards, Data Quality Reports/Dashboards]
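
A data quality report of the kind named on slide 20 boils down to counting rule violations over the records flowing through the pipeline. The sketch below assumes two hypothetical rules and a hypothetical record shape; in the reference architecture such figures would be published to a monitoring component rather than returned from a function.

```python
from collections import Counter

# Rule name -> predicate that returns True when the record complies. Illustrative rules only.
RULES = {
    "amount_non_negative": lambda r: r["amount"] >= 0,
    "currency_present": lambda r: bool(r.get("currency")),
}

def audit(records, rules=RULES):
    """Count how many records violate each rule."""
    violations = Counter()
    for record in records:
        for name, complies in rules.items():
            if not complies(record):
                violations[name] += 1
    return {"total": len(records), "violations": dict(violations)}
```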
  21. 21. Reference Architecture - Reliable Loading [Diagram components: ESB, MB, DSS, Enterprise DW]
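
When messages can be redelivered, loading is only reliable if it is idempotent, so that replaying a batch does not duplicate rows in the warehouse. The slide does not say how this is achieved; the sketch below shows one approach using an in-memory SQLite database as a stand-in for the Enterprise DW, with a hypothetical `fact_revenue` table keyed by `txn_id`.

```python
import sqlite3

def open_warehouse():
    """Create an in-memory stand-in for the warehouse with a keyed fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE fact_revenue (txn_id TEXT PRIMARY KEY, amount REAL)")
    return conn

def load(conn, rows):
    """Upsert rows inside one transaction; replaying the same batch changes nothing."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.executemany(
            "INSERT OR REPLACE INTO fact_revenue (txn_id, amount) VALUES (?, ?)",
            rows,
        )

def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM fact_revenue").fetchone()[0]
```

Loading the same batch twice leaves the row count unchanged, which is the property that makes at-least-once delivery from the broker safe.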
  22. 22. Tooling - Smooks Editor
  23. 23. Tooling - WSO2 Data Mapper
  24. 24. Demo ● Building a transformation between two simple data models using the Smooks Editor shipped with WSO2 Developer Studio.
  25. 25. Summary ● ETL plays a pivotal role in any business organization. ● Implementing a proper ETL process within an organization often requires a lot of effort. ● Standalone ETL solutions can be costly. ● Re-architecting data models is made easy with WSO2 Enterprise Middleware Platform.
  26. 26. References [1] How to use the Smooks Editor shipped with WSO2 Developer Studio http://wso2.com/library/tutorials/2011/06/perform-data-mapping-smookseditor-wso2-carbon-studio/
  27. 27. Q&A