Open Source ETL using Talend Open Studio

  1. Open Source ETL using Talend Open Studio
     Luís Santos, luis@luissantos.pt
     February 14, 2013
  2. Overview
       1. Who am I?
       2. What is ETL?
       3. ETL Software Suites
       4. Talend Open Studio for Data Integration
       5. Hands on
       6. Conclusion
  3. Warning!!!
     This presentation was created using LaTeX.
     Why? Because I can!
  4. Who am I?
  5. Who am I?
       • Software Engineer and Mathematics Student
       • Open Source addicted
       • PHP and Java Developer
  6. What is ETL?
  7. What is ETL?
     In computing, Extract, Transform and Load (ETL) refers to a process in database usage and especially in data warehousing that involves:
       • Extracting data from outside sources
       • Transforming it to fit operational needs (which can include quality levels)
       • Loading it into the end target (database, more specifically, operational data store, data mart or data warehouse)
     (2013, http://en.wikipedia.org/wiki/Extract,_transform,_load)
  8. ETL Software Suites
       • Pentaho Data Integration (Kettle)
       • SQL Server Integration Services
       • Talend Open Studio for Data Integration
       • etc.
  9. Talend Open Studio for Data Integration
     Talend Open Studio is a set of tools for developing, testing and deploying application integration projects.
       • Talend Open Studio for Big Data
       • Bonita Open Solution (BPM)
       • Talend Open Studio for Data Integration
       • Talend Open Studio for Data Quality
       • Talend ESB
       • Talend Open Studio for MDM
  10. Datasource(rer)s
  11. Datasources (Extract and Load)
      MySQL, MSSQL, Oracle, SQLite, FirebirdSQL, XLS, CSV, XML, SOAP, REST, HTTP, FTP, SSH, IMAP
  12. Transformers
  13. Transformers (Transform)
        • Sort data
        • Convert data
        • Cross data between datasources
        • Filter data
        • Fuzzy search
        • Normalize and denormalize data
  14. Where and how?
      Where?
        • Multi-platform (Linux, Mac OS, BSD-*, even on Windows)
        • You just need a JVM (Java Virtual Machine)
  15. Where and how?
      Where?
        • Multi-platform (Linux, Mac OS, BSD-*, even on Windows)
        • You just need a JVM (Java Virtual Machine)
      How?
        • Execute it from your favorite programming language using system calls
        • Command line
        • From your JVM-based application (Java, Groovy, JRuby)
        • Web services running on top of a Java application server (Tomcat, GlassFish)
  16. Hands on
  17. Hands on
        • Querying data
        • Joining data from multiple datasources
        • Filtering and sorting data
        • Exporting data
        • Deploying your job
        • Calling it from PHP
  18. Database Schema
  19. Example
  20. "With great power comes great responsibility." (Voltaire)
  21. The End
        • email: luis@luissantos.pt
        • twitter: @santosluis87
        • linkedin: https://www.linkedin.com/in/luissantos87
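
The three steps named on slide 7 (extract, transform, load) can be illustrated outside Talend with a small hand-written pipeline. This is only a sketch under stated assumptions: the sample CSV, the `sales` table and the function names are hypothetical and do not come from the talk, and Talend itself builds jobs graphically rather than from code like this. The transform step shows three of the operations listed on slide 13: converting, filtering and sorting.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for an "outside source".
SAMPLE_CSV = """name,city,amount
Ana,Lisboa,10.50
Rui,Porto,3.20
Eva,Lisboa,7.00
"""


def extract(text):
    """Extract: read rows from CSV text into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, city):
    """Transform: convert amounts to float, keep one city, sort by amount."""
    converted = [{**row, "amount": float(row["amount"])} for row in rows]
    kept = [row for row in converted if row["city"] == city]
    return sorted(kept, key=lambda row: row["amount"])


def load(rows, connection):
    """Load: write the transformed rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (name TEXT, city TEXT, amount REAL)"
    )
    connection.executemany(
        "INSERT INTO sales (name, city, amount) VALUES (:name, :city, :amount)",
        rows,
    )
    connection.commit()


def run_pipeline(text, city, connection):
    """Chain the three steps: extract, then transform, then load."""
    load(transform(extract(text), city), connection)


if __name__ == "__main__":
    target = sqlite3.connect(":memory:")  # in-memory target, nothing is written to disk
    run_pipeline(SAMPLE_CSV, "Lisboa", target)
    print(target.execute("SELECT name, amount FROM sales ORDER BY amount").fetchall())
```

Keeping each step in its own function mirrors how an ETL job is assembled from separate source, transformer and target components, so any one step can be swapped without touching the others.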

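Slide 15 mentions executing a job from your favorite programming language through the command line. A minimal sketch of that idea follows, written in Python rather than the PHP used in the talk's demo. It assumes the job was exported with a launcher script and that the launcher accepts `--context_param name=value` arguments; the script path and the parameter name below are hypothetical, so check both against your own export.

```python
import subprocess


def build_job_command(launcher, context_params=None):
    """Build the argument list for a job launcher script.

    Assumption: the launcher accepts repeated `--context_param name=value` pairs.
    """
    command = [launcher]
    for key, value in sorted((context_params or {}).items()):
        command.append("--context_param")
        command.append(f"{key}={value}")
    return command


def run_job(launcher, context_params=None):
    """Run the job as a child process and return (exit code, captured stdout)."""
    completed = subprocess.run(
        build_job_command(launcher, context_params),
        capture_output=True,
        text=True,
        check=False,  # let the caller decide what a non-zero exit code means
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # Hypothetical launcher path and context parameter, for illustration only.
    print(build_job_command("./export_customers/export_customers_run.sh",
                            {"output_dir": "/tmp/out"}))
```

Passing the command as a list instead of a single shell string avoids quoting problems when a parameter value contains spaces.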