Pentaho Data Integration Introduction

A short, gentle introduction to Pentaho Data Integration, a.k.a. Kettle

Published in: Technology

Transcript

  • 1.
      Pentaho Introduction
      Matt Casters
  • 2. Matt Casters
    • Chief of Data Integration at Pentaho
      • Lead Development
      • Project manager
      • Community liaison
    • Kettle Project Founder
    • Author of Pentaho Kettle Solutions
      • Published by Wiley
      • 650 pages
  • 7. Pentaho Data Integration for BI
    • Business Intelligence! That's what we do.
  • 8. Pentaho Data Integration – Kettle
    • KETTLE: Kettle Extraction, Transportation, Transformation, Loading Environment
  • 9. Pentaho Data Integration – Extraction
    • Extract data from :
      • 35+ database types
        • MySQL, PostgreSQL, SQLite, ...
        • Oracle, SQL Server, etc.
      • Text files
      • XML files
      • XLS files
      • Xbase files (dBase, FoxPro, etc.)
      • File system information
      • Generated data
      • MS Access files
      • LDAP
      • Geo-data
      • ...
  • 20. Pentaho Data Integration – Transportation
    • Transportation of data
      • Engine based data transfer (no code generator)
      • Very flexible pathways:
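
The "engine based, no code generator" point can be illustrated with a small sketch: the pipeline is described as data (a list of step objects) and a generic engine interprets that description at run time, instead of generating and compiling a program per pipeline. This is only an illustration of the idea and not Kettle's actual API; every name below (`generate_rows`, `keep_if`, `uppercase`, `run_pipeline`) is hypothetical.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, object]
# A step consumes a stream of rows and produces a stream of rows.
Step = Callable[[Iterator[Row]], Iterator[Row]]


def generate_rows(count: int) -> Step:
    """Input step (hypothetical): ignores upstream rows and emits generated rows."""
    def step(_rows: Iterator[Row]) -> Iterator[Row]:
        for i in range(count):
            yield {"id": i, "name": f"customer {i}"}
    return step


def keep_if(predicate: Callable[[Row], bool]) -> Step:
    """Filter step (hypothetical): passes on only the rows matching the predicate."""
    def step(rows: Iterator[Row]) -> Iterator[Row]:
        return (row for row in rows if predicate(row))
    return step


def uppercase(field: str) -> Step:
    """Transform step (hypothetical): upper-cases one field of every row."""
    def step(rows: Iterator[Row]) -> Iterator[Row]:
        for row in rows:
            yield {**row, field: str(row[field]).upper()}
    return step


def run_pipeline(steps: List[Step]) -> List[Row]:
    """The 'engine': chains the configured steps and streams rows through them."""
    rows: Iterator[Row] = iter(())
    for step in steps:
        rows = step(rows)
    return list(rows)


# The pipeline is plain data; changing it requires no code generation.
pipeline = [
    generate_rows(4),
    keep_if(lambda row: row["id"] % 2 == 0),
    uppercase("name"),
]
result = run_pipeline(pipeline)
print(result)
```

Because the pipeline is an ordinary list, steps can be reordered, added or removed without touching the engine, which is the property the slide contrasts with code generators.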
  • 27. Pentaho Data Integration – Transformation
    • Flexibly transform data
  • 35. Pentaho Data Integration – Loading
    • Load data into a target format
  • 41. Pentaho Data Integration – Environment
    • Full GUI called “Spoon” to edit every option in Kettle
    • Command line tools
      • execute jobs
      • execute transformations
    • Web server
      • clustering
      • remote execution
    • Programming API for Java
    • Plugin ecosystem
    • ...
  • 48. Pentaho Data Integration – Community
    • Paying Pentaho customers
    • Large and small corporations
      • All possible sectors
    • Lone rangers & hobbyists
    • All regions on Earth
    • Meet on our Forum: 40,000+ posts in 10,000 threads in 4 years
    • Use our JIRA case tracking systems
    • Download more than 10,000 copies of Kettle per month
      • http://www.ohloh.net/projects/3624?p=Kettle
      • http://www.softpedia.com/progClean/Kettle-Clean-80094.html
  • 54. Pentaho Data Integration – use-cases
    • Load data from text files and store it into a database
    • Export data from databases to text files or to other databases
    • Data migration between database applications
    • Exploration of data in existing databases (tables, views, etc.)
    • Information improvement using lookups
    • Data cleaning
    • Application integration
    • Data warehouse population
    • Report data generation
    • ...
  • 65. Pentaho Data Integration – Adoption
    • Wide range of production deployments
      • Small and medium-sized companies
      • Large enterprises
    • Rapid product evolution
      • Driven by Pentaho investment
      • Includes significant community contributions
        • “Contribution-friendly” architecture
        • Natural fit for additional data sources, targets and transformations
  • 69. Pentaho Data Integration – Adoption
    • Most deployed open source data integration solution, according to an independent study by Mark Madsen of Third Nature and the BeyeNETWORK
    • Download the free study at pentaho.com
  • 71.
      Big Data
  • 72. Pentaho – Big Data
    • Enabling BI on top of big data
    • From terabytes to petabytes
    • Big Data stored in Hadoop (MapReduce) / HDFS / Hive
    • Reduces complexity for developers
    • Leverages standard components like Pentaho Data Integration
    • Drag & drop creation of map and reduce transformations
    • Cooperation with Apache
    • Presentation + Demo: http://vimeo.com/14641559
  • 80. Pentaho Data Integration – Links
    • Homepage: http://kettle.pentaho.org
    • Forum: http://forums.pentaho.org/forumdisplay.php?f=69
    • Case tracker: http://jira.pentaho.org/browse/PDI
    • Continuous Integration Server: http://ci.pentaho.com/job/Kettle
    • Wiki: http://wiki.pentaho.org/display/EAI
    • IRC Channel: ##pentaho (on Freenode)
    • Mailing list: http://groups.google.com/group/kettle-developers
    • My blog: http://www.ibridge.be
    • My coordinates: mcasters at pentaho dot org
  • 89. Pentaho Books
  • 90. Q&A
      Thank you for listening!
