ETL with WSO2 Enterprise Middleware Platform
 

Presentation Transcript

  • ETL with WSO2 Enterprise Middleware Platform Prabath Abeysekara - Associate Technical Lead
  • Outline ● A Classic Use Case ● What’s ETL and How It Is Interpreted In The Modern World? ● Why ETL? ● Challenges In Implementing ETL Solutions ● Why Traditional Standalone ETL Products Are Considered Dead In The Modern World? ● What Factors To Be Considered When Implementing ETL In Re-Architecting A System?
  • Outline contd.. ● Impact Of Tooling ● Reference Architecture ○ How to build an “efficient, robust, scalable, auditable, performing and maintainable” ETL solution with WSO2 EMP? ● Demo - Data Mapping With WSO2 Developer Studio ● Summary ● Q&A
  • A Classic Use Case - Financial Sector (diagram labels: Flat files, RDBMS, XML, Web Services; ETL Process; Enterprise Data Warehouse; Financial Reporting, Revenue Predictions, Other Analytics & BI fronts)
  • What’s ETL? - Traditional Interpretation ● Extract ● Transform ● Load
  • What’s ETL? - Modern Interpretation ● Extract ● Monitor ● Profile/Audit ● Analyze ● Cleanse ● Transform ● Load
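The steps named on the two "What's ETL?" slides can be sketched as a small pipeline. This is a minimal illustration in plain Python, not WSO2 code: the sample CSV, the `ledger` table, and every function name are assumptions made up for the example, and SQLite stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file source; the second row has a non-numeric amount.
RAW_CSV = """account,amount,currency
A-100, 120.50 ,usd
A-101,not-a-number,usd
A-102,75,eur
"""

def is_numeric(value):
    """Return True when the value parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def extract(text):
    """Extract: parse rows out of the flat-file source."""
    return list(csv.DictReader(io.StringIO(text)))

def profile(rows):
    """Profile/Audit: report how many rows carry an invalid amount."""
    bad = sum(1 for row in rows if not is_numeric(row["amount"]))
    return {"total": len(rows), "invalid_amount": bad}

def cleanse(rows):
    """Cleanse: drop rows that fail the numeric check."""
    return [row for row in rows if is_numeric(row["amount"])]

def transform(rows):
    """Transform: trim whitespace, convert types, normalize currency codes."""
    return [
        (row["account"].strip(), float(row["amount"]), row["currency"].strip().upper())
        for row in rows
    ]

def load(conn, records):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ledger (account TEXT, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO ledger VALUES (?, ?, ?)", records)
    conn.commit()

def run_pipeline(conn, text):
    """Run extract, profile, cleanse, transform, load; return the audit report."""
    rows = extract(text)
    report = profile(rows)
    load(conn, transform(cleanse(rows)))
    return report
```

Calling `run_pipeline(sqlite3.connect(":memory:"), RAW_CSV)` loads the two valid rows and returns the profiling report, which is the kind of figure a monitoring or auditing step would publish.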
  • Why ETL? ● Generally, to build and maintain data repositories that hold a “single version of the truth” drawn from the multiple heterogeneous data sources scattered across an organization or a business domain. ● Business users can then use that data for: ○ Predictive analysis ○ Revenue predictions and comparisons ○ Monitoring the overall growth of an organization ○ Business policies ○ Strategic decisions
  • Challenges ● Data definition establishment ● Need for expert knowledge ● Scalability and performance ● Business user acceptance and seamless support for a wide range of business use cases ● Maintenance, data archival ● Real-time or near-real-time data synchronization
  • Why Standalone ETL Products Are Dead? ● Modern-day organizations are evolving faster than ever before. ● The tendency to adopt architecture patterns such as SOA, to reduce IT costs and gain flexible business processes, is rapidly increasing. ● Organizations are more focused on “connected businesses”. ● Thus, it’s very likely that an organization already has an IT infrastructure in place.
  • Why Standalone ETL Products Are Dead? ● Adopting a standalone ETL product? Possible, but worthwhile? ● Generally less support for open standards. Extension points? Connectors? More custom code! ● Usually relies on some proprietary data integration patterns, inducing high maintenance costs. ● Additional licensing costs and the need for separate expert/operational assistance, again inducing high maintenance costs.
  • Why Standalone ETL Products Are Dead? ● Tendency to use in-house reusable business components, leveraging the benefits of SOA ● Lower operational costs ● Scalability is a main focus nowadays. ● Having a similar process implemented enables horizontal scalability at different layers as the need arises.
  • Re-Architecting A System’s DIL? ● Data integration is always cumbersome ● Need for ensuring policy compliance of data at its target containers (usually Enterprise Data Warehouses, central MDM repositories, etc.) ● Flexibility ● Ensuring acceptable performance ● What about reliability?
  • Re-Architecting A System’s DIL? ● How to deal with the freshness of data? ● When to synchronize? ● Need for tuning the system to meet various SLAs
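One common way to approach the freshness and "when to synchronize" questions is incremental synchronization against a high-water mark. The slides do not prescribe this technique; the sketch below is an assumption-laden illustration in plain Python, where the `updated_at` field, the in-memory `target` dictionary, and all function names are hypothetical.

```python
from datetime import datetime, timedelta

def rows_changed_since(source_rows, watermark):
    """Return the rows modified strictly after the watermark."""
    return [row for row in source_rows if row["updated_at"] > watermark]

def sync_once(source_rows, target, watermark):
    """Copy changed rows into target (keyed by id).

    Returns the new watermark and the number of rows copied.
    """
    changed = rows_changed_since(source_rows, watermark)
    for row in changed:
        target[row["id"]] = row
    if changed:
        watermark = max(row["updated_at"] for row in changed)
    return watermark, len(changed)

def is_stale(watermark, now, max_age):
    """True when the target lags behind the freshness limit (for example an SLA)."""
    return now - watermark > max_age
```

A scheduler could call `sync_once` at a fixed interval and use `is_stale` with the agreed maximum age to decide whether an extra run, or an alert, is needed.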
  • Impact Of Tooling (diagram labels: Scripts, XSLT, Custom Code)
  • Impact Of Tooling ● Numerous ETL solutions fail because of a lack of tooling. ● Developers/solution composers are left with manual coding of XSLT, custom mappers, etc. ● Not scalable! ● A powerful, flexible tooling platform is often required, particularly as the system grows and matures.
  • Reference Architecture
  • Reference Architecture - Big Picture (diagram components: BAM, ESB, MB ×2, DSS ×2, DS, Enterprise DW)
  • Reference Architecture - Reliable extraction (diagram components: ESB, MB, DSS, Scheduled Tasks, DS)
  • Reference Architecture - Validate & Transform (diagram components: WSO2 Data Mapper, ESB, Input Data Model (Data Model X), Output Data Model (Data Model Y))
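The Validate & Transform slide maps an input data model (Data Model X) to an output data model (Data Model Y). The sketch below shows the general idea of a declarative field mapping with a validation step. It is plain Python, not the WSO2 Data Mapper or a Smooks configuration, and the field names and `FIELD_MAP` structure are invented for illustration.

```python
# Hypothetical mapping: output field -> (input field, converter).
FIELD_MAP = {
    "customerId": ("cust_id", str),
    "fullName": ("name", lambda value: value.strip().title()),
    "balance": ("bal", float),
}

def validate(record, required):
    """Return the required input fields that are absent or empty."""
    return [field for field in required if record.get(field) in ("", None)]

def map_record(record):
    """Validate a Model X record, then build the Model Y record from FIELD_MAP."""
    missing = validate(record, [source for source, _ in FIELD_MAP.values()])
    if missing:
        raise ValueError(f"missing input fields: {missing}")
    return {
        target: convert(record[source])
        for target, (source, convert) in FIELD_MAP.items()
    }
```

Keeping the mapping in a data structure rather than in hand-written conversion code is roughly what a graphical mapper automates: the mapping can be reviewed and changed without touching the transformation logic.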
  • Reference Architecture - Auditing (diagram components: ESB, BAM, Data Policy Compliance Reports/Dashboards, Data Quality Reports/Dashboards)
  • Reference Architecture - Reliable Loading (diagram components: ESB, MB, DSS, Enterprise DW)
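The reliable extraction and loading slides list a message broker (MB) next to the ESB and DSS. Assuming the broker's role is to buffer records so that a failed load can be retried, the pattern can be sketched as below. This is a plain Python simulation using an in-memory queue, not the WSO2 Message Broker API; `drain`, `max_attempts`, and the dead-letter list are assumptions for the example.

```python
from collections import deque

def drain(queue, load, max_attempts=3):
    """Deliver each queued message to `load`, retrying failures.

    Each queue entry is a (message, attempts) pair. A message leaves the
    queue for good only after `load` succeeds; failures are requeued until
    `max_attempts` is reached, then moved to the returned dead-letter list.
    """
    dead_letters = []
    while queue:
        message, attempts = queue.popleft()
        try:
            load(message)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(message)
            else:
                queue.append((message, attempts))
    return dead_letters
```

Because a message may be delivered more than once, the loader should be idempotent, for example by writing with an upsert keyed on a record identifier.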
  • Tooling - Smooks Editor
  • Tooling - WSO2 Data Mapper
  • Demo ● Building a transformation between two simple data models using the Smooks Editor shipped with WSO2 Developer Studio.
  • Summary ● ETL plays a pivotal role in any business organization. ● Implementing a proper ETL process within an organization often requires a lot of effort. ● Standalone ETL solutions can be costly. ● Re-architecting data models is made easy with WSO2 Enterprise Middleware Platform.
  • References [1] How to use the Smooks Editor shipped with WSO2 Developer Studio http://wso2.com/library/tutorials/2011/06/perform-data-mapping-smookseditor-wso2-carbon-studio/
  • Q&A