Transcript

  • 1. Open Source ETL using Talend Open Studio. Luís Santos, luis@luissantos.pt. February 14, 2013
  • 2. Overview: 1. Who am I? 2. What is ETL? 3. ETL Software Suites 4. Talend Open Studio for Data Integration 5. Hands on 6. Conclusion
  • 3. Warning!!! This presentation was created using LaTeX. Why? Because I can!
  • 4. Who am I?
  • 5. Who am I? Software engineer and mathematics student; open source addict; PHP and Java developer
  • 6. What is ETL?
  • 7. What is ETL? In computing, Extract, Transform and Load (ETL) refers to a process in database usage and especially in data warehousing that involves: extracting data from outside sources; transforming it to fit operational needs (which can include quality levels); loading it into the end target (database, more specifically, operational data store, data mart or data warehouse). (2013, http://en.wikipedia.org/wiki/Extract,_transform,_load)
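
The three steps in the definition above can be illustrated with a minimal sketch that uses only the Python standard library. This is not how Talend implements ETL. The sample data, the `products` table, and the function names are assumptions made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would be a file, a database,
# or a web service.
RAW_CSV = """name,price_eur
 Widget ,9.50
gadget,12
broken,not-a-number
"""


def extract(text):
    """Extract: read rows from an outside source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy names, convert prices to integer cents, drop bad rows."""
    clean = []
    for row in rows:
        try:
            cents = round(float(row["price_eur"]) * 100)
        except ValueError:
            continue  # quality rule: skip rows whose price is not numeric
        clean.append((row["name"].strip().lower(), cents))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products (name TEXT, price_cents INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the end target
load(transform(extract(RAW_CSV)), conn)
result = conn.execute(
    "SELECT name, price_cents FROM products ORDER BY name"
).fetchall()
print(result)
```

The row with a non-numeric price is dropped during the transform step, which is one simple way to enforce the "quality levels" mentioned in the definition.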
  • 8. ETL Software Suites: Pentaho Data Integration (Kettle); SQL Server Integration Services; Talend Open Studio for Data Integration; etc.
  • 9. Talend Open Studio for Data Integration. Talend Open Studio is a set of tools for developing, testing, and deploying application integration projects. Talend Open Studio for Big Data; Bonita Open Solution (BPM); Talend Open Studio for Data Integration; Talend Open Studio for Data Quality; Talend ESB; Talend Open Studio for MDM
  • 10. Datasource(rer)s
  • 11. Datasources (Extract and Load): MySQL, MSSQL, Oracle, SQLite, FirebirdSQL, XLS, CSV, XML, SOAP, REST, HTTP, FTP, SSH, IMAP
  • 12. Transformers
  • 13. Transformers (Transform): sort data; convert data; cross data between datasources; filter data; fuzzy search; normalize and denormalize data
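
To make the transformer types listed above concrete, here is a small sketch in plain Python. It does not use Talend components. The `customers` and `orders` records are invented sample data, and "normalize" is assumed here to mean splitting a delimited field into one row per value, with "denormalize" as the reverse.

```python
from difflib import get_close_matches

# Two hypothetical datasources.
customers = [
    {"id": 1, "name": "Ana", "tags": "vip,newsletter"},
    {"id": 2, "name": "Bruno", "tags": "trial"},
]
orders = [
    {"customer_id": 1, "total": "30.00"},
    {"customer_id": 2, "total": "5.50"},
    {"customer_id": 1, "total": "12.25"},
]

# Convert: parse the string totals into floats.
orders = [{**o, "total": float(o["total"])} for o in orders]

# Cross data between datasources: look up each order's customer by key.
names = {c["id"]: c["name"] for c in customers}
joined = [{"name": names[o["customer_id"]], "total": o["total"]} for o in orders]

# Filter, then sort: keep orders of at least 10, ordered by total.
big = sorted((r for r in joined if r["total"] >= 10), key=lambda r: r["total"])

# Fuzzy search: find the closest customer name to a misspelled input.
match = get_close_matches("Brunno", list(names.values()), n=1)

# Normalize: one row per tag. Denormalize: back to one entry per customer.
normalized = [(c["id"], tag) for c in customers for tag in c["tags"].split(",")]
denormalized = {}
for customer_id, tag in normalized:
    denormalized.setdefault(customer_id, []).append(tag)

print(big, match, normalized, denormalized)
```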
  • 14. Where and how? Where? Multi-platform (Linux, Mac OS, BSD-*, even on Windows). You just need a JVM (Java Virtual Machine).
  • 15. Where and how? Where? Multi-platform (Linux, Mac OS, BSD-*, even on Windows). You just need a JVM (Java Virtual Machine). How? Execute it from your favorite programming language using system calls; from the command line; from your JVM-based application (Java, Groovy, JRuby); as web services running on top of a Java app server (Tomcat, Glassfish)
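
The "execute it from your favorite programming language" option above amounts to launching the exported job as a child process. The sketch below shows one way to do that from Python. The launcher path `./export_customers/export_customers_run.sh` and the `output_dir` parameter are hypothetical, and the `--context_param name=value` argument form is an assumption about how the exported job takes its parameters, so check it against your own export.

```python
import subprocess
import sys


def build_job_command(launcher, context_params):
    """Build the argument list for a job launcher script.

    Assumption: the job accepts `--context_param name=value` pairs.
    """
    cmd = [launcher]
    for name, value in context_params.items():
        cmd += ["--context_param", f"{name}={value}"]
    return cmd


def run_job(cmd, timeout=600):
    """Run the command and return (exit_code, stdout)."""
    done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    return done.returncode, done.stdout


# Hypothetical launcher path and context parameter.
cmd = build_job_command(
    "./export_customers/export_customers_run.sh",
    {"output_dir": "/tmp/out"},
)
print(cmd)

# Stand-in child process so the sketch runs without an exported job.
code, out = run_job([sys.executable, "-c", "print('job finished')"])
print(code, out.strip())
```

Passing the command as a list rather than a single shell string avoids quoting problems in parameter values. A non-zero exit code is the simplest signal that the job failed.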
  • 16. Hands on
  • 17. Hands on: querying data; joining data from multiple datasources; filtering and sorting data; exporting data; deploying your job; calling it from PHP
  • 18. Database Schema
  • 19. Example
  • 20. “With great power comes great responsibility.” (Voltaire)
  • 21. The End. email: luis@luissantos.pt; twitter: @santosluis87; linkedin: https://www.linkedin.com/in/luissantos87