Choosing the Right Steps in Pentaho Kettle


Some general slides about how to pick steps in Pentaho Data Integration (aka Kettle).

Published in: Technology

  1. Choosing the Right Steps in Pentaho Kettle. Alex Meadows, BI Engineer, iContact. August RTP – PUG Meetup
  2. Kettle (PDI) – The ETL Swiss Army Knife
       • Over 100 steps
       • Plugin architecture
       • Scripting steps
       • Which to use?
  3. Example: Loading a Text File
       • Text File Input, right?
           • Will work for most text files
           • Most powerful of the text file inputs
       • There are other options in PDI!
       • Find the one that most closely matches what you're trying to do
  4. Example: Sharded Databases
       • Default feature of database connections
       • Not dynamic, so it has to be updated as needed
  5. Example: Sharded Databases (continued)
       • Needed a dynamic list of shards
       • Built a job and a transformation to read the shard list from a table and perform a function on each shard in the table
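
The dynamic shard approach on this slide can be pictured outside of Kettle as a loop driven by a control table. The sketch below is only a Python analogy, not the job and transformation from the talk: the `shard_list` table, its `active` column, the host names, and the `process_shard` placeholder are all hypothetical names chosen for illustration.

```python
import sqlite3


def load_shards(conn):
    """Read the current shard list from a control table (hypothetical schema)."""
    rows = conn.execute(
        "SELECT shard_name, host FROM shard_list "
        "WHERE active = 1 ORDER BY shard_name"
    ).fetchall()
    return [{"shard_name": name, "host": host} for name, host in rows]


def process_shard(shard):
    """Placeholder for the per-shard work (an extract, a load, a check...)."""
    return f"processed {shard['shard_name']} on {shard['host']}"


def run_for_all_shards(conn):
    """Re-read the shard table on every run, then apply the same function to each shard."""
    return [process_shard(shard) for shard in load_shards(conn)]


def demo():
    # In-memory database standing in for wherever the shard list really lives.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shard_list (shard_name TEXT, host TEXT, active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO shard_list VALUES (?, ?, ?)",
        [
            ("shard_a", "db1.example.com", 1),
            ("shard_b", "db2.example.com", 1),
            ("shard_c", "db3.example.com", 0),  # inactive, so it is skipped
        ],
    )
    return run_for_all_shards(conn)


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Because the shard list is read at run time, adding or retiring a shard means changing a row in the table rather than editing the process itself, which is the point the slide makes about the non-dynamic default.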
  6. Plugins – Add More Functions
       • Community contributions
           • Teradata Bulk Loader
           • R/Weka integration
       • Treated as siblings of native steps
           • All native steps are, in essence, plugins
       • Many eventually become part of the core product
       • Processing is handled directly within the engine, just like native steps
  7. Scripting Steps
       • Greatest functionality/flexibility
       • Executed/compiled at runtime
       • Can dramatically slow performance
       • If a script is used in multiple places, turn it into a plugin for potentially better performance
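
To make the runtime cost concrete, here is a rough Python analogy, not a description of how Kettle's scripting steps are implemented. It computes the same per-row value three ways: re-evaluating script text for every row, compiling the script once and reusing it, and calling ordinary native code. The `SCRIPT` expression, the row layout, and the function names are hypothetical.

```python
import timeit

# Hypothetical rows and a hypothetical user-supplied script expression.
ROWS = [{"price": float(i), "qty": i % 7} for i in range(1000)]
SCRIPT = "row['price'] * row['qty']"


def run_interpreted(rows):
    """Parse and compile the script text again for every single row."""
    return [eval(SCRIPT, {}, {"row": row}) for row in rows]


def run_precompiled(rows):
    """Compile the script once, then evaluate the code object per row."""
    code = compile(SCRIPT, "<script>", "eval")
    return [eval(code, {}, {"row": row}) for row in rows]


def run_native(rows):
    """The same logic written as ordinary code, with no script layer."""
    return [row["price"] * row["qty"] for row in rows]


def compare(repeat=20):
    """Time each variant; absolute numbers depend on the machine."""
    variants = {
        "interpreted": run_interpreted,
        "precompiled": run_precompiled,
        "native": run_native,
    }
    return {
        name: timeit.timeit(lambda fn=fn: fn(ROWS), number=repeat)
        for name, fn in variants.items()
    }


if __name__ == "__main__":
    for name, seconds in compare().items():
        print(f"{name:12s} {seconds:.4f}s")
```

All three variants return identical results; only the amount of work done per row differs. That is the general reason a script reused in many places may be worth turning into a plugin, though actual gains should be measured rather than assumed.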
  8. Recommended Reading
       • Pentaho Solutions (general BI audience)
       • Pentaho Data Integration Beginner's Guide (beginner)
       • Pentaho Data Integration Cookbook (intermediate)
       • Pentaho Kettle Solutions (advanced)