Pentaho Data Integration Introduction
mattcasters
A short, gentle introduction to Pentaho Data Integration, a.k.a. Kettle
1.
2.
3.
Project manager
4.
5.
6.
650 pages
7.
Pentaho Data Integration for BI: Business Intelligence! That's what we do.
8.
Pentaho Data Integration – Kettle: Kettle Extraction, Transportation, Transformation, Loading Environment
9.
10.
11.
XML files
12.
XLS files
13.
Xbase files (dBase, FoxPro, etc.)
14.
File systems information
15.
Generated data
16.
MS Access files
17.
LDAP
18.
Geo-data
19.
...
20.
21.
22.
partitioning
23.
merging
24.
joining
25.
duplicating
26.
clustering (MPP)
27.
28.
files
29.
30.
31.
Mapping
32.
Selecting
33.
Filtering
34.
Pivoting ...
35.
36.
Data warehouse population
37.
Partitioned loading
38.
Bulk loading
39.
Parallel loading
40.
Clustering
41.
42.
Debugger
43.
44.
45.
46.
Plugin eco-system
47.
...
48.
49.
50.
All regions on Earth
51.
Meet on our forum: more than 40,000 posts in 10,000 threads in 4 years
52.
Use our JIRA case tracking system
53.
Download more than 10,000 copies of Kettle per month
http://www.ohloh.net/projects/3624?p=Kettle
http://www.softpedia.com/progClean/Kettle-Clean-80094.html
54.
55.
Export data from databases to text files or to other databases
56.
Data migration between database applications
57.
Exploration of data in existing databases (tables, views, etc.)
58.
Information improvement using lookups
59.
Data cleaning
60.
Application integration
61.
Data warehouse population
62.
Application integration
63.
Report data generation
64.
...
65.
66.
67.
68.
Natural fit for additional data sources, targets and transformations
69.
70.
Download free study at pentaho.com
71.
72.
73.
From terabytes to petabytes
74.
Big Data stored in Hadoop (MapReduce) / HDFS / Hive
75.
Reduces complexity for developers
76.
Leverages standard components like Pentaho Data Integration
77.
Drag & drop creation of map and reduce transformations
78.
Cooperation with Apache
79.
Presentation + Demo: http://vimeo.com/14641559
80.
81.
Forum: http://forums.pentaho.org/forumdisplay.php?f=69
82.
Case tracker: http://jira.pentaho.org/browse/PDI
83.
Continuous Integration Server: http://ci.pentaho.com/job/Kettle
84.
Wiki: http://wiki.pentaho.org/display/EAI
85.
IRC Channel: ##pentaho (on Freenode)
86.
Mailing list: http://groups.google.com/group/kettle-developers
87.
My blog: http://www.ibridge.be
88.
My coordinates: mcasters at pentaho dot org
89.
Pentaho Books
90.