Pentaho Data Integration
Student Names: Muhammad Ayaz Farid Shah & Usama Naeem
Class: MSCS
What is business intelligence?
▪ Business intelligence is the process of transforming business data
into information and knowledge using computer-based techniques, thus
enabling users to make effective, fact-based decisions.
Business Intelligence: Need of the hour.
▪ What insightful decisions can I make from an ocean of data? How
quickly can I decide based on that much data? → End-to-end BI
solution
▪ How can I integrate heterogeneous data feeds into a common platform
for analysis? → ETL
▪ How can I interpret raw data in the best possible manner? → Data
discovery
▪ Can I predict the future trajectory of my business? → ML, predictive analytics
▪ What is the best way to share the data? → Visualization, reporting
▪ How can I monitor the dynamics of changing trends? → Dashboards
BI is essentially intended for the
following three things.
▪ Precise and concise interpretation of data.
▪ Identifying new opportunities.
▪ Implementing an effective strategy to gain a competitive edge.
Existing BI Solutions.
Large BI Vendors    New Breed
IBM                 Pentaho
SAP                 QlikTech
Microsoft           Logi
Oracle              Alteryx

Data Integration    Analytics
Informatica         RapidMiner
Why Pentaho?
▪ One-stop solution for all business analytics needs.
▪ Low integration time and infrastructure cost.
▪ Has community support.
▪ Easily scalable.
▪ Virtually unlimited visualizations and data sources.
▪ And much more.
About Pentaho
▪ Pentaho was founded in 2004 in Orlando, USA.
▪ Recognized leader in business analytics and data integration.
▪ Subscription-based business model.
▪ Achieved critical mass:
Over 1,200 commercial customers.
Over 10,000 production deployments.
Over 185 countries.
Download the Pentaho BI suite from the website:
www.pentaho.com or www.sourceforge.net
What is Pentaho and what does it offer?
▪ It is a business intelligence system.
It offers
▪ Analytics
▪ Visual data integration
▪ Reports
▪ Dashboards
▪ Data mining
▪ ETL
Pentaho
Available for
▪ Windows
▪ Linux
▪ Mac OSX
▪ Community supported.
▪ Open-source plugins available.
Pentaho Data Integration (Kettle)
▪ Kettle
Kettle → Kettle Extraction, Transformation, Transportation,
and Loading tool.
▪ Extraction
▪ Transportation
▪ Transformation
▪ Loading
▪ Environment
Data integration → Challenges
▪ Data is everywhere.
▪ Data is inconsistent.
Records differ from system to system.
▪ Performance issues.
Queries that summarize data take a long time to run.
▪ Data is never all in the data warehouse.
Some of it lives in Excel sheets and new applications.
What is Kettle?
▪ Batch data integration and processing tool written in Java.
▪ Exists to retrieve, process, and load data.
▪ ETL (Extract, Transform, and Load).
▪ Extracts data from various data sources.
▪ Transforms data
▪ From → being optimized for transactions.
▪ To → being optimized for reporting and analysis.
▪ Synchronizes data coming from different databases.
▪ Cleanses data to remove errors.
▪ Loads data into the data warehouse.
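The extract, transform, and load sequence listed above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how Kettle itself works: the sample CSV data, the `sales` table, and all function names are assumptions made up for this example, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice this would come from a file or database.
SOURCE_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,20.00
3, alice ,4.50
"""


def extract(text):
    """Extract: read raw rows from CSV text."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean customer names and convert fields to numbers."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
        for r in rows
    ]


def load(rows, conn):
    """Load: write the cleaned rows into a reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(text, conn):
    """Run the three stages, then return a per-customer summary."""
    load(transform(extract(text)), conn)
    # A summary query of the kind a reporting-oriented table is built for.
    return conn.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(run_etl(SOURCE_CSV, sqlite3.connect(":memory:")))
```

Keeping the three stages as separate functions mirrors the way an ETL tool separates input, transformation, and output steps, so each stage can be changed or tested on its own.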
Why do I need ETL?
▪ ETL tools save time and money when developing a data warehouse by
removing the need for hand coding.
▪ It is very difficult for database administrators to connect
different brands of databases without an external tool.
▪ ETL is the heart and soul of business intelligence (BI).
▪ Provides a graphical environment for data integration, migration, and
synchronization.
▪ Drag-and-drop graphical components execute the desired task, saving
time and effort.
ETL
▪ The criteria used to compare ETL tools were divided into the following categories.
▪ TCO (total cost of ownership).
Open-source products are typically free to use, but companies still pay for support, training, and
consulting.
▪ Risk. (Going over budget, running over schedule, or not meeting the customers' requirements.)
▪ Ease of use. (A good GUI reduces the time needed to learn and use the tool.)
▪ Support. (Nowadays all software products have support.)
▪ Speed. (Pentaho Kettle is faster.)
▪ Data quality. (Data quality checking is fast and has features built into the GUI.)
▪ Monitoring. (Pentaho Kettle has practical monitoring tools.)
▪ Connectivity. (ETL tools transfer data to a very wide variety of database systems, XML, and web
services.)
What is Kettle good for?
▪ Loading data into an RDBMS.
▪ Syncing two data sources.
▪ Processing data retrieved from multiple sources and pushed to multiple
destinations.
▪ Graphical manipulation of data.
▪ It has a very easy-to-use GUI.
Data Sources
▪ Files
▪ Databases
▪ SQL
▪ XML
▪ JSON
▪ Excel
▪ Google Analytics
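To illustrate what reading from several of these source types into one common shape might look like, here is a small Python sketch using only the standard library. The sample data, the `row` element name, and the function names are assumptions chosen for this example; Kettle provides its own input steps for these formats.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def rows_from_csv(text):
    """Read CSV text into a list of dicts keyed by the header row."""
    return [dict(r) for r in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """Read JSON text; assumes the document is a list of flat objects."""
    return [{k: str(v) for k, v in obj.items()} for obj in json.loads(text)]


def rows_from_xml(text, row_tag="row"):
    """Read XML text; assumes each <row> holds one child element per field."""
    root = ET.fromstring(text)
    return [
        {child.tag: (child.text or "") for child in row}
        for row in root.iter(row_tag)
    ]


# Hypothetical inputs describing the same kind of record in three formats.
CSV_TEXT = "id,name\n1,Ann\n"
JSON_TEXT = '[{"id": 2, "name": "Ben"}]'
XML_TEXT = "<rows><row><id>3</id><name>Cy</name></row></rows>"

# All three sources end up as the same structure: a list of string-valued dicts.
all_rows = rows_from_csv(CSV_TEXT) + rows_from_json(JSON_TEXT) + rows_from_xml(XML_TEXT)
```

Once every source yields rows in the same shape, later processing does not need to know where a row came from.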
Larger picture
Kettle is 10 years old.
It joined Pentaho about 7 years ago.
Open source, at version 4.4.
BI suite
▪ Reporting
▪ Analytics
▪ Dashboards
▪ ML (Machine Learning)
Kettle Tools
▪ Spoon (lets you design transformations and jobs that can be run with
the Kettle tools)
▪ Kitchen (executes jobs designed in Spoon, stored as XML or in a database repository)
▪ Pan (executes transformations designed in Spoon, stored as XML or in a
database repository)
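As a rough sketch of how Pan and Kitchen might be driven from a script, the Python helper below only builds a command line and does not run anything. It assumes the Linux launcher scripts `pan.sh` and `kitchen.sh` and the `-file`, `-level`, and `-param:` options; on Windows the launchers are `.bat` files, and the exact option syntax should be checked against the documentation for your PDI version. The installation directory and file paths are hypothetical.

```python
from pathlib import PurePosixPath


def kettle_command(pdi_home, path, level="Basic", params=None):
    """Build (but do not run) a command line for Pan or Kitchen.

    Transformations saved as XML use the .ktr extension and are run by Pan;
    jobs use the .kjb extension and are run by Kitchen.
    """
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".ktr":
        launcher = "pan.sh"
    elif suffix == ".kjb":
        launcher = "kitchen.sh"
    else:
        raise ValueError(f"expected a .ktr or .kjb file, got {path!r}")
    cmd = [
        str(PurePosixPath(pdi_home) / launcher),
        f"-file={path}",
        f"-level={level}",
    ]
    # Named parameters are passed one option per parameter.
    for name, value in (params or {}).items():
        cmd.append(f"-param:{name}={value}")
    return cmd
```

On a machine with PDI installed, the returned list could be handed to `subprocess.run(cmd, check=True)`.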
Most common uses of Kettle
▪ Data warehouse and data mart loads.
▪ Data integration. (Changing input to desired output)
▪ Data cleansing.
▪ Data migration.
▪ Data export.
▪ Etc.
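Data cleansing, one of the uses listed above, can be illustrated with a short Python sketch. The record layout (`name` and `email` fields) and the rules applied here are assumptions made up for the example; real cleansing rules depend on the data at hand.

```python
def cleanse(records):
    """Trim whitespace, normalize case, drop rows without an email, de-duplicate."""
    seen = set()
    cleaned = []
    for rec in records:
        # Collapse repeated spaces and use title case for names.
        name = " ".join(rec.get("name", "").split()).title()
        email = rec.get("email", "").strip().lower()
        if not email or email in seen:
            continue  # skip incomplete or duplicate rows
        seen.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned
```

For example, two differently formatted entries for the same email address collapse into a single cleaned row, and a row with an empty email is dropped.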
Pentaho Data integration
▪ Transportation of data.
▪ Splitting
▪ Partitioning
▪ Merging
▪ Joining
▪ Duplicating
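The row-movement operations named above can be sketched as small Python functions over lists of dicts. This is a simplified illustration under stated assumptions, not Kettle's implementation: rows are plain dicts, the partition key is assumed to be an integer, and the join shown is an inner join.

```python
from collections import defaultdict


def split(rows, predicate):
    """Splitting: route each row to one of two outputs."""
    yes, no = [], []
    for row in rows:
        (yes if predicate(row) else no).append(row)
    return yes, no


def partition(rows, key, n):
    """Partitioning: distribute rows over n buckets by an integer key."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[row[key] % n].append(row)
    return buckets


def merge(*streams):
    """Merging: append several row streams into one."""
    return [row for stream in streams for row in stream]


def join(left, right, key):
    """Joining: inner join of two row streams on a shared key."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def duplicate(rows, copies=2):
    """Duplicating: give each output its own independent copy of every row."""
    return [[dict(row) for row in rows] for _ in range(copies)]
```

Each function takes rows in and hands rows out, which is what lets such operations be chained together in any order.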
Pentaho
Steps for downloading and installing Pentaho
▪ Step 1: Download Java from https://download.oracle.com
▪ Step 2: Download Pentaho from https://sourceforge.net
▪ Step 3: Create a new folder on the C: drive and give it the same name as the
version of Pentaho.
▪ Step 4: Extract Pentaho into this new folder.
▪ Step 5: Go to My Computer -> Properties -> Advanced system
settings -> Environment Variables -> New -> Variable name: JAVA_HOME
▪ Step 6: Check the JRE in CMD by typing echo %JAVA_HOME%
▪ Step 7: From the Pentaho folder, run spoon.bat as Administrator.
Step 1: Download Java
▪ Download Java from https://download.oracle.com
Step 2: Download Pentaho
▪ Download Pentaho from https://sourceforge.net
Step 3: Create a new folder
▪ Create a new folder on the C: drive and give it the same name as the version of Pentaho.
Step 4: Extract Pentaho
▪ Extract Pentaho into this new folder.
Step 5: Environment variables
▪ Go to My Computer -> Properties -> Advanced system settings -> Environment
Variables -> New -> Variable name: JAVA_HOME
Step 6: Check the JRE
▪ Check the JRE in CMD by typing echo %JAVA_HOME%
Step 7: Run Pentaho
▪ From the Pentaho folder, run spoon.bat as Administrator.
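The JAVA_HOME check from Step 6 can also be scripted. The following Python sketch is one possible way to do it, assuming that a usable Java installation has a `java` (or `java.exe` on Windows) executable under the `bin` folder of JAVA_HOME; the function name is made up for this example.

```python
import os
from pathlib import Path


def check_java_home(env=None):
    """Return (ok, message) describing whether JAVA_HOME points at a Java install."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME", "").strip()
    if not java_home:
        return False, "JAVA_HOME is not set"
    bin_dir = Path(java_home) / "bin"
    # java.exe on Windows, java elsewhere
    if any((bin_dir / name).is_file() for name in ("java.exe", "java")):
        return True, f"JAVA_HOME looks valid: {java_home}"
    return False, f"no java executable under {bin_dir}"


if __name__ == "__main__":
    print(check_java_home()[1])
```

Unlike `echo %JAVA_HOME%`, which only prints the value, this also checks that the folder actually contains a Java executable before spoon.bat is started.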