Pentaho Data Integration Introduction

A gentle and short introduction to Pentaho Data Integration, a.k.a. Kettle

Published in: Technology

Transcript

  • 1. Pentaho Introduction. Matt Casters
  • 2. Matt Casters
      • Chief of Data Integration at Pentaho
          • Lead development
          • Project manager
          • Community liaison
      • Kettle project founder
      • Author of Pentaho Kettle Solutions
          • Published by Wiley
          • 650 pages
  • 7. Pentaho Data Integration for BI: Business Intelligence! That's what we do.
  • 8. Pentaho Data Integration – Kettle: Kettle Extraction Transportation Transformation Loading Environment
  • 9. Pentaho Data Integration – Extraction
      • Extract data from:
          • 35+ database types
              • MySQL, PostgreSQL, SQLite, ...
              • Oracle, SQL Server, etc.
          • Text files
          • XML files
          • XLS files
          • Xbase files (dBase, FoxPro, etc.)
          • File system information
          • Generated data
          • MS Access files
          • LDAP
          • Geo-data
          • ...
  • 20. Pentaho Data Integration – Transportation
      • Transportation of data
          • Engine-based data transfer (no code generator)
          • Very flexible pathways:
              • splitting
              • partitioning
              • merging
              • joining
              • duplicating
              • clustering (MPP)
  • 27. Pentaho Data Integration – Transformation
      • Flexibly transform data
          • Looking up data
              • databases
              • files
              • memory...
          • Calculating
          • Scripting
              • JavaScript, SQL, RegExp
          • Splitting
          • Mapping
          • Selecting
          • Filtering
          • Pivoting ...
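The transformation slide lists looking up, calculating, filtering and selecting as typical operations on a stream of rows. The following is only a rough Python sketch of that idea and does not use Kettle or its API; the field names (`country`, `quantity`, `unit_price`), the lookup table and all function names are hypothetical and chosen purely for illustration.

```python
from typing import Any, Dict, Iterable, Iterator, List

Row = Dict[str, Any]

# Hypothetical in-memory lookup table (country code -> country name).
COUNTRY_NAMES = {"BE": "Belgium", "US": "United States"}


def lookup_country(rows: Iterable[Row]) -> Iterator[Row]:
    """Looking up data: enrich each row from an in-memory table."""
    for row in rows:
        name = COUNTRY_NAMES.get(row["country"], "unknown")
        yield {**row, "country_name": name}


def calculate_total(rows: Iterable[Row]) -> Iterator[Row]:
    """Calculating: derive a new field from existing fields."""
    for row in rows:
        yield {**row, "total": row["quantity"] * row["unit_price"]}


def filter_min_total(rows: Iterable[Row], minimum: float) -> Iterator[Row]:
    """Filtering: keep only rows whose total reaches the minimum."""
    for row in rows:
        if row["total"] >= minimum:
            yield row


def select_fields(rows: Iterable[Row], fields: List[str]) -> Iterator[Row]:
    """Selecting: keep only the named fields, in the given order."""
    for row in rows:
        yield {name: row[name] for name in fields}


def run_pipeline(rows: Iterable[Row]) -> List[Row]:
    """Chain the steps so that rows flow through them one at a time."""
    stream = lookup_country(rows)
    stream = calculate_total(stream)
    stream = filter_min_total(stream, minimum=10.0)
    return list(select_fields(stream, ["id", "country_name", "total"]))
```

Each step is a generator, so rows pass from one step to the next without the whole data set being held in memory. That loosely mirrors the row-by-row, engine-based processing the slides describe, although a real Kettle transformation is designed graphically in Spoon rather than written as code.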
  • 35. Pentaho Data Integration – Loading
      • Load data into a target format
          • Database loads
          • Data warehouse population
          • Partitioned loading
          • Bulk loading
          • Parallel loading
          • Clustering
  • 41. Pentaho Data Integration – Environment
      • Full GUI called “Spoon” to edit every option in Kettle
          • Drag & drop
          • Debugger
          • Rich GUI
      • Command line tools
          • execute jobs
          • execute transformations
      • Web server
          • clustering
          • remote execution
      • Programming API for Java
      • Plugin ecosystem
      • ...
  • 48. Pentaho Data Integration – Community
      • Paying Pentaho customers
      • Large and small corporations
          • All possible sectors
      • Lone rangers & hobbyists
      • All regions on Earth
      • Meet on our forum: 40,000+ posts in 10,000 threads in 4 years
      • Use our JIRA case tracking system
      • Download more than 10,000 copies of Kettle per month
      http://www.ohloh.net/projects/3624?p=Kettle
      http://www.softpedia.com/progClean/Kettle-Clean-80094.html
  • 54. Pentaho Data Integration – Use cases
      • Load data from text files and store it in a database
      • Export data from a database to text files or to other databases
      • Data migration between database applications
      • Exploration of data in existing databases (tables, views, etc.)
      • Information improvement using lookups
      • Data cleaning
      • Application integration
      • Data warehouse population
      • Report data generation
      • ...
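The first use case, loading data from a text file into a database, can be sketched in a few lines of plain Python. This is not how Kettle does it internally and it does not use Kettle at all; the `customers` table, its columns and the `load_csv_into_table` function are assumptions made for illustration, and SQLite stands in for whatever target database is actually used.

```python
import csv
import io
import sqlite3
from typing import TextIO


def load_csv_into_table(text_file: TextIO, connection: sqlite3.Connection) -> int:
    """Read comma-separated rows with a header line and insert them.

    Returns the number of rows loaded. The table layout is hypothetical.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
    )
    reader = csv.DictReader(text_file)
    # Light cleaning on the way in: convert the id and trim whitespace.
    rows = [
        (int(record["id"]), record["name"].strip(), record["city"].strip())
        for record in reader
    ]
    # The connection context manager commits on success, rolls back on error.
    with connection:
        connection.executemany(
            "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)", rows
        )
    return len(rows)


# Usage with an in-memory text file and an in-memory database.
sample = io.StringIO("id,name,city\n1, Ann ,Ghent\n2,Bob,Boston\n")
conn = sqlite3.connect(":memory:")
loaded = load_csv_into_table(sample, conn)
```

In a tool like Kettle the same flow would typically be assembled from a text file input step and a table output step; the sketch only shows the shape of the work those steps perform.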
  • 65. Pentaho Data Integration – Adoption
      • Wide range of production deployments
          • Small and medium-sized companies
          • Large enterprises
      • Rapid product evolution
          • Driven by Pentaho investment
          • Includes significant community contributions
              • “Contribution-friendly” architecture
              • Natural fit for additional data sources, targets and transformations
  • 69. Pentaho Data Integration – Adoption
      • Most deployed open source data integration solution (independent study by Mark Madsen of Third Nature and the BeyeNETWORK)
      • Download the free study at pentaho.com
  • 71. Big Data
  • 72. Pentaho – Big Data
      • Enabling BI on top of big data
      • From terabytes to petabytes
      • Big data stored in Hadoop (MapReduce) / HDFS / Hive
      • Reduces complexity for developers
      • Leverages standard components like Pentaho Data Integration
      • Drag & drop creation of map and reduce transformations
      • Cooperation with Apache
      • Presentation + demo: http://vimeo.com/14641559
  • 80. Pentaho Data Integration – Links
      • Homepage: http://kettle.pentaho.org
      • Forum: http://forums.pentaho.org/forumdisplay.php?f=69
      • Case tracker: http://jira.pentaho.org/browse/PDI
      • Continuous Integration Server: http://ci.pentaho.com/job/Kettle
      • Wiki: http://wiki.pentaho.org/display/EAI
      • IRC channel: ##pentaho (on Freenode)
      • Mailing list: http://groups.google.com/group/kettle-developers
      • My blog: http://www.ibridge.be
      • My coordinates: mcasters at pentaho dot org
  • 89. Pentaho Books
  • 90. Q&A: Thank you for listening!
