Informatica + Hadoop = Best of Both Worlds
1. Why Informatica Instead of Hand-Coding for Big Data
Ahmed Tayeh, Key Account Manager
Telecommunications, Health Care, Insurance, Finance, Public Sector
2. Should You Hand-Code or Use Informatica
Informatica + Open Source = Best of Both Worlds
[Diagram: SOURCE DATA (Servers & Mainframe; Databases, Files; Sensor data; Social) → INGEST (Replicate, Stream, Archive, Batch) → Prepare, Refine, Govern → DELIVER (Services, Events, Topics, Batch) → TARGET SYSTEMS (Backend DBs, Mobile Apps, Analytics & Op Dashboards, Analytics Teams); also labeled: MDM, EDW]
Best of Informatica: 20 years DI innovations
Best of Open Source: scalable distributed computing
3. Should You Hand-Code or Use Informatica
Informatica + Open Source = Best of Both Worlds
[Diagram: the same source-to-target flow as on the previous slide, annotated with the labels below]
Best of Informatica: 20 years DI innovations
• Data intensive batch ETL
• Real-time compute intensive ETL
• End to end data lineage
• Zero footprint data ingestion
• Native INFA apps on Hadoop
• Leverage UDF as M/R translator
Best of Open Source: scalable distributed computing
• MapReduce
• Spark & Tez
• Navigator & Falcon
• YARN
• Sqoop
• Hive
4. Staff Projects with Readily Available Skills
Informatica Developers are Hadoop Developers
Hand-coding
A large global bank grew staff from 2 Java developers to 100 Informatica developers after implementing Informatica Big Data Edition.
Careerbuilder.com found in a survey that there were 27,000 requests for Hadoop skills and only 3,000 resumes with Hadoop skills, whereas there are over 100,000 trained Informatica developers globally.
5. Increase Developer Productivity
Informatica Developers are up to 5x more productive
Hadoop hand-coders: 4 weeks vs. Informatica developers: 4 days, with 2X performance!
Informatica developers are 5x more productive based on customer POCs
6. Accelerate Deployment
Informatica Accelerates Big Data projects into production
[Chart: Time to Deploy, Informatica vs. Hand-Coding]
Maximize Reuse
Available 24x7
Scale Performance
Flexible to Change
Easy to Maintain & Govern
Automatically Deploy
With Informatica, everything you build in the sandbox can be immediately deployed as enterprise-ready production. “Our Big Data POC was so successful the business asked us to keep it up and running as a production system.”
Hand-coding:
• Need to refactor/rebuild code for production requirements
• Difficult to maintain
• Inflexible to change as Hadoop evolves and new data types are onboarded
• SDLC is up to 5x longer due to additional testing
• Little to no reuse across projects
• No data governance, cannot audit lineage
7. Reduce Risk of Changing Technologies
Informatica provides an insurance policy as Hadoop changes
Minimize or eliminate the need to rebuild or recode data pipelines & quickly adopt new innovations in the Big Data community
[Diagram labels: Development, Deployment; Hadoop, Cloud, DI Servers, Data Warehouse]
9. Informatica and Hand-Coding on Hadoop Decision Guide
How to determine the best approach for your organization
When to use Hand-Coding
• One-off, simple prototypes and projects with only 1 or 2 data sources
• You have lots of skilled Hadoop developers and budget to devote to data integration, quality & analytics
• Data is easy to access and use on Hadoop
• Data quality is not a priority
• Hand-coding is the corporate strategy for all data integration
When to use Informatica
• Sophisticated projects that need to go from pilot to production quickly
• Need development speed & operational efficiency of visual tools for maintenance, modifications and reuse
• Plan to scale to multiple projects & departments
• Have complex specialized industry formats or unstructured data that needs to be parsed
• You want to use more affordable PowerCenter developers for data integration and quality
• Data is complex to access and use (SAP, mainframe, etc.)
• Data quality, governance & lineage are important
• Already a PowerCenter customer and want to leverage existing skills and prior work
10. Why Informatica Big Data Management?
Informatica + Hadoop = Best of Both Worlds
• Capitalize on Big Data
• Easily integrate more data faster from more data sources
• Collaborate with big data governance and quality
• Protect more data without more risk