SlideShare a Scribd company logo
ETL Tools
21 June 2017 www.snipe.co.in 2
Prepared :Snipe Team
Agenda
Table of Contents
Introduction
What do ETL tools do?
Why use an ETL tool?
ETL tools
Comparison
Conclusion
Overview
What do ETLs tool do?
An ETL tool is a tool that:
Extracts data from various data sources (usually legacy)
Transforms data
from -> being optimized for transaction
to -> being optimized for reporting and analysis
synchronizes the data coming from different databases
data cleanses to remove errors
Loads data into a data warehouse
Overview
Why use an ETL tool?
ETL tools save time and money when developing a data warehouse by
removing the need for “hand-coding”.
“Hand Coding” is still the most common way of integrating data today. It
requires hours and hours of development and expertise to create a
Business-Intelligence-System.
It is very difficult for data base administrators to connect between
different brands of databases without using an external tool.
In the event that databases are altered or new databases need to be
integrated, a lot of “hand-coded” work needs to be completely redone.
Overview
How to Implement ETL System
Tools
ETL Tools
Pentaho Kettle
Pentaho is a commercial open-source BI suite that has a product called
Kettle for data integration.
It uses an innovative meta-driven approach and has a strong and very
easy-to-use GUI
The company started around 2001
It has a strong community of 13,500 registered users
It uses a stand-alone java engine that process the tasks for moving data
between many different databases and files
Tools
ETL Tools
Talend
Talend is an open-source data integration tool
It uses a code-generating approach and uses a GUI (implemented in
Eclipse RC)
It started around October 2006
It has a much smaller community then Pentaho, but is supported by 2
finance companies
It generates Java code or Perl code which can later be run on a server
Tools
ETL Tools
Informatica PowerCenter
Informatica has a very good commercial data integration suite
It was founded in 1993
It is the market share leader in data integration (Gartner Dataquest)
It has 2600 customers. Of those, there are fortune 100 companies,
companies listed on the Dow Jones and government organization
The company's sole focus is data integration
It has quite a big package for enterprises to integrate their systems,
cleanse their data and can connect to a vast number of current and legacy
systems
Open Source Tools
ETL Tools
Inaplex Inaport
Inaplex is a small UK company
InaPlex is a producer of Customer Data Integration products for mid-
market CRM solutions
Inaplex mainly focuses on providing simple solutions for it’s customers to
integrate their data into CRM and accounting software like Sage and
Goldmine
Comparison
Type
IBM (Information Server Infosphere platform)
Advantages:
strongest vision on the market, flexibility
progress towards common metadata platform
high level of satisfaction from clients and a variety of initiatives
Disadvantages:
difficult learning curve
long implementation cycles
became very heavy (lots of GBs) with version 8.x and requires a lot of
processing power
Type
Informatica PowerCenter
Advantages:
most substantial size and resources on the market of data integration
tools vendors
consistent track record, solid technology, straightforward learning
curve, ability to address real-time data integration schemes
Informatica is highly specialized in ETL and Data Integration and
focuses on those topics, not on BI as a whole
focus on B2B data exchange
Disadvantages:
several partnerships diminishing the value of technologies
limited experience in the field.
Type
Microsoft (SQL Server Integration Services)
Advantages:
broad documentation and support, best practices to data warehouses
ease and speed of implementation
standardized data integration
real-time, message-based capabilities
relatively low cost - excellent support and distribution model
Disadvantages:
problems in non-Windows environments. Takes over all Microsoft
Windows limitations.
unclear vision and strategy
Type
Oracle (OWB and ODI)
Advantages:
based on Oracle Warehouse Builder and Oracle Data Integrator – two
very powerful tools;
tight connection to all Oracle datawarehousing applications;
tendency to integrate all tools into one application and one environment.
Disadvantages:
focus on ETL solutions, rather than in an open context of data
management;
tools are used mostly for batch-oriented work, transformation rather
than real-time processes or federation data delivery;
long-awaited bond between OWB and ODI brought only promises -
customers confused in the functionality area and the future is uncertain
Type
SAP BusinessObjects (Data Integrator / Data
Services)
Advantages:
integration with SAP
SAP Business Objects created a firm company determined to stir the
market;
Good data modeling and data-management support;
SAP Business Objects provides tools for data mining and quality;
profiling due to many acquisitions of other companies.
Quick learning curve and ease of use
Disadvantages:
SAP Business Objects is seen as two different companies
Uncertain future. Controversy over deciding which method of delivering
data integration to use (SAP BW or BODI).
BusinessObjects Data Integrator (Data Services) may not be seen as a
Types
SAS
Advantages:
experienced company, great support and most of all very powerful data
integration tool with lots of multi-management features
can work on many operating systems and gather data through number of
sources – very flexible
great support for the business-class companies as well for those medium
and minor ones
Disadvantages:
misplaced sales force, company is not well recognized
SAS has to extend influences to reach non-BI community
Costly
Types
Sun Microsystems
Advantages:
Data integration tools are a part of huge Java Composite Application
Platform Suite - very flexible with ongoing development of the products
'Single-view' services draw together data from variety of sources; small
set of vendors with a strong vision
Disadvantages:
relative weakness in bulk data movement
limited mindshare in the market
support and services rated below adequate
Types
Sybase
Advantages:
assembled a range of capabilities to be able to address a mulitude of
data delivery styles
size and global presence of Sybase create opportunities in the market
pragmatic near-term strategy - better of current market demand
broad partnerships with other data quality and data integration tools
vendors
Disadvantages:
falls behind market leaders and large vendors
gaps in many aspects of data management
Types
Syncsort
Advantages:
functionality; well-known brand on the market (40 years experience);
loyalimplementation, strong performance, targeted functionality and
lower costs customer and experience base;
easy
Disadvantages:
struggle with gaining mind share in the market
lack of support for other than ETL delivery styles
unsatisfactory with lack of capability of professional services
Types
Tibco Software
Advantages:
message-oriented application integration; capabilities based on common
SOA structures;
support for federated views; easy implementation, support
andperformance
Disadvantages:
scarce references from customers; not widely enough recognised for
data integration competencies
lacking in data quality capabilities.
Comparison
Pentaho Kettle vs Talend
Pentaho
Pentaho is a commerical open-source BI suite that has a product called
Kettle for data integration.
It uses an innovative meta-driven approach and has a strong and very
easy-to-use GUI.
The company started around 2001 (2002 was when kettle was integrated
into it).
It has a strong community of 13,500 registered users.
It has a stand-alone java engine that process the jobs and tasks for
moving data between many different databases and files.
It can schedule tasks (but you need a schedular for that - cron).
It can run remote jobs on "slave servers" on other machines.
It has data quality features: from its own GUI, writing more customised
SQL queries, Javascript and regular expressions.
Conclusion
Conclusion
Informatica and Pentaho have very good products.
Informatica has a far more extensive range of products, but compared
to Pentaho is very expensive.
Pentaho has proved that it can handle small to large scale systems.
Pentaho is gaining fast momentum with businesses that would not have
considered using open source products before.
ETL

More Related Content

What's hot

Etl techniques
Etl techniquesEtl techniques
Etl techniques
mahezabeenIlkal
 
Informatica PowerCenter
Informatica PowerCenterInformatica PowerCenter
Informatica PowerCenter
Ramy Mahrous
 
Open Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETLOpen Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETL
Jonathan Levin
 
ETL Process
ETL ProcessETL Process
ETL Process
Rashmi Bhat
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
Databricks
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
Databricks
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Edureka!
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
Databricks
 
Ppt
PptPpt
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
Ivo Andreev
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
Vivek Aanand Ganesan
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
DataScienceConferenc1
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
Adam Doyle
 
ELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it mattersELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it matters
Matillion
 
Data warehousing
Data warehousingData warehousing
Data warehousing
Juhi Mahajan
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
James Serra
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
James Serra
 
ETL Process
ETL ProcessETL Process
ETL Process
Karthik Selvaraj
 
What is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data WharehouseWhat is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data Wharehouse
BugRaptors
 

What's hot (20)

Etl techniques
Etl techniquesEtl techniques
Etl techniques
 
Informatica PowerCenter
Informatica PowerCenterInformatica PowerCenter
Informatica PowerCenter
 
Open Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETLOpen Source ETL vs Commercial ETL
Open Source ETL vs Commercial ETL
 
ETL Process
ETL ProcessETL Process
ETL Process
 
Intro to Delta Lake
Intro to Delta LakeIntro to Delta Lake
Intro to Delta Lake
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
 
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
Data Warehouse Tutorial For Beginners | Data Warehouse Concepts | Data Wareho...
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Ppt
PptPpt
Ppt
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
ELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it mattersELT vs. ETL - How they’re different and why it matters
ELT vs. ETL - How they’re different and why it matters
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
ETL Process
ETL ProcessETL Process
ETL Process
 
What is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data WharehouseWhat is ETL testing & how to enforce it in Data Wharehouse
What is ETL testing & how to enforce it in Data Wharehouse
 

Viewers also liked

Build tool
Build toolBuild tool
Build tool
Mallikarjuna G D
 
Ci
CiCi
Ejb
EjbEjb
Visual basics
Visual basicsVisual basics
Visual basics
Mallikarjuna G D
 
Jdbc
JdbcJdbc
Maven
MavenMaven
Web services engine
Web services engineWeb services engine
Web services engine
Mallikarjuna G D
 
Ide benchmarking
Ide benchmarkingIde benchmarking
Ide benchmarking
Mallikarjuna G D
 
Project excursion career_orientation
Project excursion career_orientationProject excursion career_orientation
Project excursion career_orientation
Mallikarjuna G D
 
Training
TrainingTraining
Digital marketing
Digital marketingDigital marketing
Digital marketing
Mallikarjuna G D
 

Viewers also liked (11)

Build tool
Build toolBuild tool
Build tool
 
Ci
CiCi
Ci
 
Ejb
EjbEjb
Ejb
 
Visual basics
Visual basicsVisual basics
Visual basics
 
Jdbc
JdbcJdbc
Jdbc
 
Maven
MavenMaven
Maven
 
Web services engine
Web services engineWeb services engine
Web services engine
 
Ide benchmarking
Ide benchmarkingIde benchmarking
Ide benchmarking
 
Project excursion career_orientation
Project excursion career_orientationProject excursion career_orientation
Project excursion career_orientation
 
Training
TrainingTraining
Training
 
Digital marketing
Digital marketingDigital marketing
Digital marketing
 

Similar to ETL

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
David Portnoy
 
Big data analytics beyond beer and diapers
Big data analytics   beyond beer and diapersBig data analytics   beyond beer and diapers
Big data analytics beyond beer and diapers
Kai Zhao
 
Product Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications IntroductionProduct Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications Introduction
AcevedoApps
 
Gowthami_Resume
Gowthami_ResumeGowthami_Resume
Gowthami_Resume
Gowthami Subramaniam
 
BAKKIYA_4YR
BAKKIYA_4YRBAKKIYA_4YR
Resume
ResumeResume
Resume
rajeswari p
 
VenkatSubbaReddy_Resume
VenkatSubbaReddy_ResumeVenkatSubbaReddy_Resume
VenkatSubbaReddy_Resume
Venkata SubbaReddy
 
A Comparitive Study Of ETL Tools
A Comparitive Study Of ETL ToolsA Comparitive Study Of ETL Tools
A Comparitive Study Of ETL Tools
Rhonda Cetnar
 
10 Best Big Data Management Tools
10 Best Big Data Management Tools10 Best Big Data Management Tools
10 Best Big Data Management Tools
PromptCloud
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
Jane Roberts
 
Pentaho Suite Analysis
Pentaho Suite Analysis Pentaho Suite Analysis
Pentaho Suite Analysis
Kymberly Grayson-Perry
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White Paper
Impetus Technologies
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
Hortonworks
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
Denodo
 
Informatica Interview Questions & Answers
Informatica Interview Questions & AnswersInformatica Interview Questions & Answers
Informatica Interview Questions & Answers
ZaranTech LLC
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
JEAN-MICHEL LETENNIER
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
Eric Javier Espino Man
 
Tableau Suite Analysis
Tableau Suite Analysis Tableau Suite Analysis
Tableau Suite Analysis
Kymberly Grayson-Perry
 
Richa_Profile
Richa_ProfileRicha_Profile
Richa_Profile
Richa Sharma
 

Similar to ETL (20)

Agile Business Intelligence
Agile Business IntelligenceAgile Business Intelligence
Agile Business Intelligence
 
Big data analytics beyond beer and diapers
Big data analytics   beyond beer and diapersBig data analytics   beyond beer and diapers
Big data analytics beyond beer and diapers
 
Product Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications IntroductionProduct Analysis Oracle BI Applications Introduction
Product Analysis Oracle BI Applications Introduction
 
Gowthami_Resume
Gowthami_ResumeGowthami_Resume
Gowthami_Resume
 
BAKKIYA_4YR
BAKKIYA_4YRBAKKIYA_4YR
BAKKIYA_4YR
 
Resume
ResumeResume
Resume
 
VenkatSubbaReddy_Resume
VenkatSubbaReddy_ResumeVenkatSubbaReddy_Resume
VenkatSubbaReddy_Resume
 
A Comparitive Study Of ETL Tools
A Comparitive Study Of ETL ToolsA Comparitive Study Of ETL Tools
A Comparitive Study Of ETL Tools
 
10 Best Big Data Management Tools
10 Best Big Data Management Tools10 Best Big Data Management Tools
10 Best Big Data Management Tools
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRobertsWP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
WP_Impetus_2016_Guide_to_Modernize_Your_Enterprise_Data_Warehouse_JRoberts
 
Pentaho Suite Analysis
Pentaho Suite Analysis Pentaho Suite Analysis
Pentaho Suite Analysis
 
Building a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White PaperBuilding a Big Data Analytics Platform- Impetus White Paper
Building a Big Data Analytics Platform- Impetus White Paper
 
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption2015 02 12 talend hortonworks webinar challenges to hadoop adoption
2015 02 12 talend hortonworks webinar challenges to hadoop adoption
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Informatica Interview Questions & Answers
Informatica Interview Questions & AnswersInformatica Interview Questions & Answers
Informatica Interview Questions & Answers
 
AtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White PapaerAtomicDBCoreTech_White Papaer
AtomicDBCoreTech_White Papaer
 
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...White paper   making an-operational_data_store_(ods)_the_center_of_your_data_...
White paper making an-operational_data_store_(ods)_the_center_of_your_data_...
 
Tableau Suite Analysis
Tableau Suite Analysis Tableau Suite Analysis
Tableau Suite Analysis
 
Richa_Profile
Richa_ProfileRicha_Profile
Richa_Profile
 

More from Mallikarjuna G D

Reactjs
ReactjsReactjs
Bootstrap 5 ppt
Bootstrap 5 pptBootstrap 5 ppt
Bootstrap 5 ppt
Mallikarjuna G D
 
CSS
CSSCSS
Angular 2.0
Angular  2.0Angular  2.0
Angular 2.0
Mallikarjuna G D
 
Spring andspringboot training
Spring andspringboot trainingSpring andspringboot training
Spring andspringboot training
Mallikarjuna G D
 
Hibernate
HibernateHibernate
Hibernate
Mallikarjuna G D
 
Jspprogramming
JspprogrammingJspprogramming
Jspprogramming
Mallikarjuna G D
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
Mallikarjuna G D
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
Mallikarjuna G D
 
Mmg logistics edu-final
Mmg  logistics edu-finalMmg  logistics edu-final
Mmg logistics edu-final
Mallikarjuna G D
 
Interview preparation net_asp_csharp
Interview preparation net_asp_csharpInterview preparation net_asp_csharp
Interview preparation net_asp_csharp
Mallikarjuna G D
 
Interview preparation devops
Interview preparation devopsInterview preparation devops
Interview preparation devops
Mallikarjuna G D
 
Interview preparation testing
Interview preparation testingInterview preparation testing
Interview preparation testing
Mallikarjuna G D
 
Interview preparation data_science
Interview preparation data_scienceInterview preparation data_science
Interview preparation data_science
Mallikarjuna G D
 
Interview preparation full_stack_java
Interview preparation full_stack_javaInterview preparation full_stack_java
Interview preparation full_stack_java
Mallikarjuna G D
 
Enterprunership
EnterprunershipEnterprunership
Enterprunership
Mallikarjuna G D
 
Core java
Core javaCore java
Core java
Mallikarjuna G D
 
Type script
Type scriptType script
Type script
Mallikarjuna G D
 
Angularj2.0
Angularj2.0Angularj2.0
Angularj2.0
Mallikarjuna G D
 
Git Overview
Git OverviewGit Overview
Git Overview
Mallikarjuna G D
 

More from Mallikarjuna G D (20)

Reactjs
ReactjsReactjs
Reactjs
 
Bootstrap 5 ppt
Bootstrap 5 pptBootstrap 5 ppt
Bootstrap 5 ppt
 
CSS
CSSCSS
CSS
 
Angular 2.0
Angular  2.0Angular  2.0
Angular 2.0
 
Spring andspringboot training
Spring andspringboot trainingSpring andspringboot training
Spring andspringboot training
 
Hibernate
HibernateHibernate
Hibernate
 
Jspprogramming
JspprogrammingJspprogramming
Jspprogramming
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
 
Servlet programming
Servlet programmingServlet programming
Servlet programming
 
Mmg logistics edu-final
Mmg  logistics edu-finalMmg  logistics edu-final
Mmg logistics edu-final
 
Interview preparation net_asp_csharp
Interview preparation net_asp_csharpInterview preparation net_asp_csharp
Interview preparation net_asp_csharp
 
Interview preparation devops
Interview preparation devopsInterview preparation devops
Interview preparation devops
 
Interview preparation testing
Interview preparation testingInterview preparation testing
Interview preparation testing
 
Interview preparation data_science
Interview preparation data_scienceInterview preparation data_science
Interview preparation data_science
 
Interview preparation full_stack_java
Interview preparation full_stack_javaInterview preparation full_stack_java
Interview preparation full_stack_java
 
Enterprunership
EnterprunershipEnterprunership
Enterprunership
 
Core java
Core javaCore java
Core java
 
Type script
Type scriptType script
Type script
 
Angularj2.0
Angularj2.0Angularj2.0
Angularj2.0
 
Git Overview
Git OverviewGit Overview
Git Overview
 

Recently uploaded

View Inheritance in Odoo 17 - Odoo 17 Slides
View Inheritance in Odoo 17 - Odoo 17  SlidesView Inheritance in Odoo 17 - Odoo 17  Slides
View Inheritance in Odoo 17 - Odoo 17 Slides
Celine George
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
heathfieldcps1
 
Edukasyong Pantahanan at Pangkabuhayan 1: Personal Hygiene
Edukasyong Pantahanan at  Pangkabuhayan 1: Personal HygieneEdukasyong Pantahanan at  Pangkabuhayan 1: Personal Hygiene
Edukasyong Pantahanan at Pangkabuhayan 1: Personal Hygiene
MJDuyan
 
modul ajar kelas x bahasa inggris 24/254
modul ajar kelas x bahasa inggris 24/254modul ajar kelas x bahasa inggris 24/254
modul ajar kelas x bahasa inggris 24/254
NurFitriah45
 
Power of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
Power of Ignored Skills: Change the Way You Think and Decide by Manoj TripathiPower of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
Power of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
Pankaj523992
 
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
Nguyen Thanh Tu Collection
 
Allopathic M1 Srudent Orientation Powerpoint
Allopathic M1 Srudent Orientation PowerpointAllopathic M1 Srudent Orientation Powerpoint
Allopathic M1 Srudent Orientation Powerpoint
Julie Sarpy
 
formative Evaluation By Dr.Kshirsagar R.V
formative Evaluation By Dr.Kshirsagar R.Vformative Evaluation By Dr.Kshirsagar R.V
formative Evaluation By Dr.Kshirsagar R.V
DrRavindrakshirsagar1
 
SEQUNCES Lecture_Notes_Unit4_chapter11_sequence
SEQUNCES  Lecture_Notes_Unit4_chapter11_sequenceSEQUNCES  Lecture_Notes_Unit4_chapter11_sequence
SEQUNCES Lecture_Notes_Unit4_chapter11_sequence
Murugan Solaiyappan
 
Imagination in Computer Science Research
Imagination in Computer Science ResearchImagination in Computer Science Research
Imagination in Computer Science Research
Abhik Roychoudhury
 
Genetics Teaching Plan: Dr.Kshirsagar R.V.
Genetics Teaching Plan: Dr.Kshirsagar R.V.Genetics Teaching Plan: Dr.Kshirsagar R.V.
Genetics Teaching Plan: Dr.Kshirsagar R.V.
DrRavindrakshirsagar1
 
How to Manage Line Discount in Odoo 17 POS
How to Manage Line Discount in Odoo 17 POSHow to Manage Line Discount in Odoo 17 POS
How to Manage Line Discount in Odoo 17 POS
Celine George
 
NAEYC Code of Ethical Conduct Resource Book
NAEYC Code of Ethical Conduct Resource BookNAEYC Code of Ethical Conduct Resource Book
NAEYC Code of Ethical Conduct Resource Book
lakitawilson
 
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.docBài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
PhngThLmHnh
 
How to Manage Early Receipt Printing in Odoo 17 POS
How to Manage Early Receipt Printing in Odoo 17 POSHow to Manage Early Receipt Printing in Odoo 17 POS
How to Manage Early Receipt Printing in Odoo 17 POS
Celine George
 
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
Alvaro Barbosa
 
Odoo 17 Events - Attendees List Scanning
Odoo 17 Events - Attendees List ScanningOdoo 17 Events - Attendees List Scanning
Odoo 17 Events - Attendees List Scanning
Celine George
 
How to Add a Filter in the Odoo 17 - Odoo 17 Slides
How to Add a Filter in the Odoo 17 - Odoo 17 SlidesHow to Add a Filter in the Odoo 17 - Odoo 17 Slides
How to Add a Filter in the Odoo 17 - Odoo 17 Slides
Celine George
 
How to Create & Publish a Blog in Odoo 17 Website
How to Create & Publish a Blog in Odoo 17 WebsiteHow to Create & Publish a Blog in Odoo 17 Website
How to Create & Publish a Blog in Odoo 17 Website
Celine George
 
How To Create a Transient Model in Odoo 17
How To Create a Transient Model in Odoo 17How To Create a Transient Model in Odoo 17
How To Create a Transient Model in Odoo 17
Celine George
 

Recently uploaded (20)

View Inheritance in Odoo 17 - Odoo 17 Slides
View Inheritance in Odoo 17 - Odoo 17  SlidesView Inheritance in Odoo 17 - Odoo 17  Slides
View Inheritance in Odoo 17 - Odoo 17 Slides
 
The basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptxThe basics of sentences session 10pptx.pptx
The basics of sentences session 10pptx.pptx
 
Edukasyong Pantahanan at Pangkabuhayan 1: Personal Hygiene
Edukasyong Pantahanan at  Pangkabuhayan 1: Personal HygieneEdukasyong Pantahanan at  Pangkabuhayan 1: Personal Hygiene
Edukasyong Pantahanan at Pangkabuhayan 1: Personal Hygiene
 
modul ajar kelas x bahasa inggris 24/254
modul ajar kelas x bahasa inggris 24/254modul ajar kelas x bahasa inggris 24/254
modul ajar kelas x bahasa inggris 24/254
 
Power of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
Power of Ignored Skills: Change the Way You Think and Decide by Manoj TripathiPower of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
Power of Ignored Skills: Change the Way You Think and Decide by Manoj Tripathi
 
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
BÀI TẬP BỔ TRỢ 4 KỸ NĂNG TIẾNG ANH LỚP 12 - GLOBAL SUCCESS - FORM MỚI 2025 - ...
 
Allopathic M1 Srudent Orientation Powerpoint
Allopathic M1 Srudent Orientation PowerpointAllopathic M1 Srudent Orientation Powerpoint
Allopathic M1 Srudent Orientation Powerpoint
 
formative Evaluation By Dr.Kshirsagar R.V
formative Evaluation By Dr.Kshirsagar R.Vformative Evaluation By Dr.Kshirsagar R.V
formative Evaluation By Dr.Kshirsagar R.V
 
SEQUNCES Lecture_Notes_Unit4_chapter11_sequence
SEQUNCES  Lecture_Notes_Unit4_chapter11_sequenceSEQUNCES  Lecture_Notes_Unit4_chapter11_sequence
SEQUNCES Lecture_Notes_Unit4_chapter11_sequence
 
Imagination in Computer Science Research
Imagination in Computer Science ResearchImagination in Computer Science Research
Imagination in Computer Science Research
 
Genetics Teaching Plan: Dr.Kshirsagar R.V.
Genetics Teaching Plan: Dr.Kshirsagar R.V.Genetics Teaching Plan: Dr.Kshirsagar R.V.
Genetics Teaching Plan: Dr.Kshirsagar R.V.
 
How to Manage Line Discount in Odoo 17 POS
How to Manage Line Discount in Odoo 17 POSHow to Manage Line Discount in Odoo 17 POS
How to Manage Line Discount in Odoo 17 POS
 
NAEYC Code of Ethical Conduct Resource Book
NAEYC Code of Ethical Conduct Resource BookNAEYC Code of Ethical Conduct Resource Book
NAEYC Code of Ethical Conduct Resource Book
 
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.docBài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
Bài tập bộ trợ anh 7 I learn smart world kì 1 năm học 2022 2023 unit 1.doc
 
How to Manage Early Receipt Printing in Odoo 17 POS
How to Manage Early Receipt Printing in Odoo 17 POSHow to Manage Early Receipt Printing in Odoo 17 POS
How to Manage Early Receipt Printing in Odoo 17 POS
 
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
Benchmarking Sustainability: Neurosciences and AI Tech Research in Macau - Ke...
 
Odoo 17 Events - Attendees List Scanning
Odoo 17 Events - Attendees List ScanningOdoo 17 Events - Attendees List Scanning
Odoo 17 Events - Attendees List Scanning
 
How to Add a Filter in the Odoo 17 - Odoo 17 Slides
How to Add a Filter in the Odoo 17 - Odoo 17 SlidesHow to Add a Filter in the Odoo 17 - Odoo 17 Slides
How to Add a Filter in the Odoo 17 - Odoo 17 Slides
 
How to Create & Publish a Blog in Odoo 17 Website
How to Create & Publish a Blog in Odoo 17 WebsiteHow to Create & Publish a Blog in Odoo 17 Website
How to Create & Publish a Blog in Odoo 17 Website
 
How To Create a Transient Model in Odoo 17
How To Create a Transient Model in Odoo 17How To Create a Transient Model in Odoo 17
How To Create a Transient Model in Odoo 17
 

ETL

  • 2. 21 June 2017 www.snipe.co.in 2 Prepared :Snipe Team
  • 3. Agenda Table of Contents Introduction What do ETL tools do? Why use an ETL tool? ETL tools Comparison Conclusion
  • 4. Overview What do ETLs tool do? An ETL tool is a tool that: Extracts data from various data sources (usually legacy) Transforms data from -> being optimized for transaction to -> being optimized for reporting and analysis synchronizes the data coming from different databases data cleanses to remove errors Loads data into a data warehouse
  • 5. Overview Why use an ETL tool? ETL tools save time and money when developing a data warehouse by removing the need for “hand-coding”. “Hand Coding” is still the most common way of integrating data today. It requires hours and hours of development and expertise to create a Business-Intelligence-System. It is very difficult for data base administrators to connect between different brands of databases without using an external tool. In the event that databases are altered or new databases need to be integrated, a lot of “hand-coded” work needs to be completely redone.
  • 7. Tools ETL Tools Pentaho Kettle Pentaho is a commercial open-source BI suite that has a product called Kettle for data integration. It uses an innovative meta-driven approach and has a strong and very easy-to-use GUI The company started around 2001 It has a strong community of 13,500 registered users It uses a stand-alone java engine that process the tasks for moving data between many different databases and files
  • 8. Tools ETL Tools Talend Talend is an open-source data integration tool It uses a code-generating approach and uses a GUI (implemented in Eclipse RC) It started around October 2006 It has a much smaller community then Pentaho, but is supported by 2 finance companies It generates Java code or Perl code which can later be run on a server
  • 9. Tools ETL Tools Informatica PowerCenter Informatica has a very good commercial data integration suite It was founded in 1993 It is the market share leader in data integration (Gartner Dataquest) It has 2600 customers. Of those, there are fortune 100 companies, companies listed on the Dow Jones and government organization The company's sole focus is data integration It has quite a big package for enterprises to integrate their systems, cleanse their data and can connect to a vast number of current and legacy systems
  • 10. Open Source Tools ETL Tools Inaplex Inaport Inaplex is a small UK company InaPlex is a producer of Customer Data Integration products for mid- market CRM solutions Inaplex mainly focuses on providing simple solutions for it’s customers to integrate their data into CRM and accounting software like Sage and Goldmine
  • 12. Type IBM (Information Server Infosphere platform) Advantages: strongest vision on the market, flexibility progress towards common metadata platform high level of satisfaction from clients and a variety of initiatives Disadvantages: difficult learning curve long implementation cycles became very heavy (lots of GBs) with version 8.x and requires a lot of processing power
  • 13. Type Informatica PowerCenter Advantages: most substantial size and resources on the market of data integration tools vendors consistent track record, solid technology, straightforward learning curve, ability to address real-time data integration schemes Informatica is highly specialized in ETL and Data Integration and focuses on those topics, not on BI as a whole focus on B2B data exchange Disadvantages: several partnerships diminishing the value of technologies limited experience in the field.
  • 14. Type Microsoft (SQL Server Integration Services) Advantages: broad documentation and support, best practices to data warehouses ease and speed of implementation standardized data integration real-time, message-based capabilities relatively low cost - excellent support and distribution model Disadvantages: problems in non-Windows environments. Takes over all Microsoft Windows limitations. unclear vision and strategy
  • 15. Type Oracle (OWB and ODI) Advantages: based on Oracle Warehouse Builder and Oracle Data Integrator – two very powerful tools; tight connection to all Oracle datawarehousing applications; tendency to integrate all tools into one application and one environment. Disadvantages: focus on ETL solutions, rather than in an open context of data management; tools are used mostly for batch-oriented work, transformation rather than real-time processes or federation data delivery; long-awaited bond between OWB and ODI brought only promises - customers confused in the functionality area and the future is uncertain
  • 16. Type SAP BusinessObjects (Data Integrator / Data Services) Advantages: integration with SAP SAP Business Objects created a firm company determined to stir the market; Good data modeling and data-management support; SAP Business Objects provides tools for data mining and quality; profiling due to many acquisitions of other companies. Quick learning curve and ease of use Disadvantages: SAP Business Objects is seen as two different companies Uncertain future. Controversy over deciding which method of delivering data integration to use (SAP BW or BODI). BusinessObjects Data Integrator (Data Services) may not be seen as a
  • 17. Types SAS Advantages: experienced company, great support and most of all very powerful data integration tool with lots of multi-management features can work on many operating systems and gather data through number of sources – very flexible great support for the business-class companies as well for those medium and minor ones Disadvantages: misplaced sales force, company is not well recognized SAS has to extend influences to reach non-BI community Costly
  • 18. Types Sun Microsystems Advantages: Data integration tools are a part of huge Java Composite Application Platform Suite - very flexible with ongoing development of the products 'Single-view' services draw together data from variety of sources; small set of vendors with a strong vision Disadvantages: relative weakness in bulk data movement limited mindshare in the market support and services rated below adequate
  • 19. Types Sybase Advantages: assembled a range of capabilities to be able to address a mulitude of data delivery styles size and global presence of Sybase create opportunities in the market pragmatic near-term strategy - better of current market demand broad partnerships with other data quality and data integration tools vendors Disadvantages: falls behind market leaders and large vendors gaps in many aspects of data management
  • 20. Types Syncsort Advantages: functionality; well-known brand on the market (40 years experience); loyalimplementation, strong performance, targeted functionality and lower costs customer and experience base; easy Disadvantages: struggle with gaining mind share in the market lack of support for other than ETL delivery styles unsatisfactory with lack of capability of professional services
  • 21. Types Tibco Software Advantages: message-oriented application integration; capabilities based on common SOA structures; support for federated views; easy implementation, support andperformance Disadvantages: scarce references from customers; not widely enough recognised for data integration competencies lacking in data quality capabilities.
  • 22. Comparison Pentaho Kettle vs Talend Pentaho Pentaho is a commerical open-source BI suite that has a product called Kettle for data integration. It uses an innovative meta-driven approach and has a strong and very easy-to-use GUI. The company started around 2001 (2002 was when kettle was integrated into it). It has a strong community of 13,500 registered users. It has a stand-alone java engine that process the jobs and tasks for moving data between many different databases and files. It can schedule tasks (but you need a schedular for that - cron). It can run remote jobs on "slave servers" on other machines. It has data quality features: from its own GUI, writing more customised SQL queries, Javascript and regular expressions.
  • 23. Conclusion Conclusion Informatica and Pentaho have very good products. Informatica has a far more extensive range of products, but compared to Pentaho is very expensive. Pentaho has proved that it can handle small to large scale systems. Pentaho is gaining fast momentum with businesses that would not have considered using open source products before.