Pentaho Data Integration with Kettle

A course on Pentaho Data Integration with Kettle. A related course on Talend is available at http://www.slideshare.net/melphi_/talend-open-studio-data-integration

Published in: Technology


  1. Pentaho Data Integration (Kettle)
  2. PDI Overview (Kettle)
     ● An entry-level tool for data manipulation (ETL)
     ● PDI (Kettle) reads procedures stored in XML format
     ● Spoon is the graphical tool used to develop those procedures
     ● Procedures are designed by linking components
     ● Many data sources can be used: JDBC, files, web services
     ● JavaScript and Java support for complex routines
     www.robertomarchetto.com
  3. Development environment
  4. Example, source database
  5. Example, destination database
  6. Schema comparison
  7. Procedure users_dimension
     Query users:
     SELECT u.id, CONCAT(u.first_name, ' ', u.last_name) as fullname, u.title
     FROM users u
     WHERE u.first_name is not null and u.last_name is not null
  8. Testing
  9. Procedure accounts_dimension
     Query accounts:
     select a.id, a.name, a.industry, a.billing_address_postalcode,
     a.billing_address_city, a.billing_address_country
     from accounts a
  10. Procedure opportunities_fact
     Query opportunities:
     SELECT o.id, o.date_entered, o.date_closed, o.assigned_user_id,
     o.sales_stage, o.name, o.amount
     FROM opportunities o
     WHERE o.sales_stage in ('Closed Won', 'Closed Lost')
     ORDER BY o.id
  11. Procedure dates_dimension
  12. Collect procedures in a job
  13. Using JNDI
     ● Edit the JNDI configuration in /simple-jndi/jdbc.properties or C:/Documents and Settings/<user>/.pentaho/simple-jndi/default.properties
  14. Running procedures
     ● Directly from Spoon
     ● From Pentaho BI Suite
     ● Using the command line (Kitchen, Pan): kitchen.bat /file:D:\Jobs\jobname.kjb /level:Basic
     ● In a clustered environment
     ● Using a web service (Carte)
  15. Publishing on Pentaho
  16. Running from Pentaho
  17. Scheduling
     ● Using Pentaho's scheduler
     ● Using an external scheduler (cron)
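
The Kitchen command line from slide 14 can also be assembled and launched from a script, which is convenient when an external scheduler such as cron starts the job. The following is a minimal sketch, not part of the original course. The function names build_kitchen_command and run_job are hypothetical, the Unix option style (kitchen.sh with -file= and -level=) is assumed to mirror the Windows form shown on the slide, and kitchen.bat or kitchen.sh is assumed to be on the PATH.

```python
import subprocess


def build_kitchen_command(job_file, level="Basic", windows=True):
    """Return the argument list that runs a .kjb job with Kitchen.

    Hypothetical helper: it only formats the options shown on slide 14
    (/file: and /level:) or their assumed Unix equivalents.
    """
    if windows:
        return ["kitchen.bat", f"/file:{job_file}", f"/level:{level}"]
    return ["kitchen.sh", f"-file={job_file}", f"-level={level}"]


def run_job(job_file, level="Basic", windows=True):
    """Launch Kitchen and return its exit code.

    Assumption: a zero exit code means the job finished without errors.
    """
    completed = subprocess.run(build_kitchen_command(job_file, level, windows))
    return completed.returncode


if __name__ == "__main__":
    # Prints the command without executing it.
    print(" ".join(build_kitchen_command(r"D:\Jobs\jobname.kjb")))
```

Returning the arguments as a list avoids shell quoting problems when the job path contains spaces. A scheduler entry would then call this script and act on the exit code returned by run_job.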