Decision Ready Data: Power Your Analytics with Great Data
Murthy Mathiprakasam
Repeatably deliver
trusted and timely data
for great analytics and
great social impact
Your Mission
Great Data Powers Great Analytics and Great Impact
CIOs See Competitive Advantage in Analytics
BUT more than half of analytics projects fail (Gartner)
Analytics is the top priority for CIOs again in 2015 (Gartner)
CIO Investment Priority
New Data Sources Enable New Insights
In the era of Big Interaction Data, unprecedented
insights are available from analyzing new data sources
SENSORS METERS LOGS
BADGES WEARABLES MOBILE
Cloud
New Data Platforms
Big Data
Traditional /
Real Time
New Visualization Tools
New Platforms Enable New Analytical Capabilities
But Data Is Fragmented As Data Sources Proliferate
Data
Warehouse
Transactional
Applications
CRM ERP HR FIN
Big
Data
Unstructured
Semi-Structured
Real-time
Events
Mainframe
Systems
Cloud, Social,
Partner Data
Enterprise
Applications
It’s Difficult to Trust All Of The Available Data
John Smith
11710 Plaza Drive
Reston, VA ______
(___)-___-____
Incomplete
DATASETS THAT
ARE NOT ACCURATE
Jonathan Smith
John Smith
John H Smith
Inconsistent
DATASETS THAT
ARE NOT STANDARDIZED
Insecure
DATASETS THAT
ARE NOT MASKED
jsmith@yahoo.com
703-844-1212
vs
TAYwRG@zcqee.Qew
194-366-5858
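The three problems on this slide (incomplete, inconsistent, and unmasked data) can be illustrated with a minimal Python sketch. It is not how any particular product works: the field names, the `NAME_ALIASES` table, and the `secret` parameter are all assumptions made for illustration, and the masking shown is a simple hash-driven character substitution rather than a vetted anonymization method.

```python
import hashlib
import string

# Hypothetical required fields for a contact record.
REQUIRED_FIELDS = ("name", "street", "city", "state", "zip", "phone")

# Hypothetical alias table; a real system would use reference data.
NAME_ALIASES = {"jonathan": "john"}


def missing_fields(record):
    """Incomplete: list required fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def match_key(full_name):
    """Inconsistent: reduce name variants to a comparable (first, last) key."""
    tokens = [t.strip(".").lower() for t in full_name.split()]
    tokens = [t for t in tokens if len(t) > 1]  # drop middle initials
    first, last = tokens[0], tokens[-1]
    return (NAME_ALIASES.get(first, first), last)


def mask_value(value, secret="demo-secret"):
    """Insecure: format-preserving mask (letters stay letters, digits stay digits)."""
    digest = hashlib.sha256((secret + value).encode("utf-8")).digest()
    out = []
    for i, ch in enumerate(value):
        b = digest[i % len(digest)]
        if ch in string.digits:
            out.append(string.digits[b % 10])
        elif ch in string.ascii_uppercase:
            out.append(string.ascii_uppercase[b % 26])
        elif ch in string.ascii_lowercase:
            out.append(string.ascii_lowercase[b % 26])
        else:
            out.append(ch)  # keep punctuation such as "@", "." and "-"
    return "".join(out)


record = {"name": "John Smith", "street": "11710 Plaza Drive",
          "city": "Reston", "state": "VA", "zip": "", "phone": ""}
print(missing_fields(record))
print({match_key(n) for n in ("Jonathan Smith", "John Smith", "John H Smith")})
print(mask_value("jsmith@yahoo.com"), mask_value("703-844-1212"))
```

Because the mask is derived from a hash of the value and a secret, the same input always produces the same masked output, which keeps joins on masked columns possible.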
Data Has Not Kept Up With The Pace of Business
Up to 80%
ANALYST TIME SPENT ON
DATA PREPARATION
Untimely Delivery
OF DATA INHIBITS
AGILE, REAL-TIME DECISIONS
Analytics Is Costly & Complex To Deliver Today
Can’t Re-Use
EXISTING SKILLS
WHEN PLATFORMS
CHANGE
Can’t Re-Use
EXISTING PROCESSES
TO DRIVE SCALABILITY
AND REPEATABILITY
As A Result, Decisions Suffer and People Suffer Impact
Can’t make comprehensive decisions
based on all of the available data
Can’t make accurate decisions based on
high quality and secure data
Can’t make timely decisions based on
fresh and up-to-date data
Can’t operationalize data delivery to
fuel decisions repeatably and scalably
What Is Needed: Faster, Better, Less Costly Analytics
Insights That
Are Timely
Data That
Can Be
Trusted
Simple,
Standardized,
Scalable
Delivery
Repeatably deliver
trusted and timely data
for great analytics and
great social impact
Your Mission
Imagine If You Could Put More Data To Use
Word, Excel
PDF
StarOffice
Email, LDAP
Oracle
DB2
SQL Server
Sybase
Informix
Teradata
Netezza
ODBC/JDBC
Flat files
HTTP/HTML
RPG
ANSI
AST
FIX
SWIFT
MVR
SAP NetWeaver
SAP NetWeaver BI
SAS
Siebel
JD Edwards
Lotus Notes
Oracle E-Business
PeopleSoft
EDI–X12
EDI-Fact
RosettaNet
HL7/HIPAA
XML
LegalXML
IFX
cXML
Salesforce
RightNow
NetSuite
Oracle OnDemand
Facebook
Twitter
LinkedIn
Datasift
ebXML
HL7 v3.0
ACORD
100+
PRE-BUILT PARSERS
200+
PRE-BUILT CONNECTORS
Out of the Box
BUSINESS RULES AND
DATA STANDARDIZATION
Sample of Compatible Data Types and Sources
Imagine If You Could Stream Data At High Speeds
REFINE
INGEST
Profile, Parse, Cleanse, Match
Stream
STORE
ACT
Complex Event Processing
NoSQL Databases
LAN/WAN SCALE
STREAMING COLLECTION
CENTRALIZED MANAGEMENT
OF DISTRIBUTED COLLECTION
REAL-TIME INGESTION OF
REAL-TIME DATA FOR
REAL-TIME RESPONSE
SOURCE
Transactional
Data
Interaction
Data
Sensor/Device
Data
Documents/
Files
Industry
Formats
Imagine If You Could Easily Adopt New Platforms
400%
FASTER MIGRATION TO
NEW PLATFORMS
500%
FASTER PERFORMANCE ON
DATA PREPARATION
0%
REWRITING OF DATA
PREPARATION JOBS
Imagine If You Could Easily Adopt Hadoop
REFINE
WAREHOUSE
Profile, Parse, Cleanse, Match
Offload infrequently
used data for
active archiving
Offload ETL/ELT for
efficient processing
Reuse existing skills
and processes
Imagine If You Could Develop And Staff More Quickly
Hadoop
Developers
Informatica
Developers
100,000+
TRAINED DEVELOPERS
WORLDWIDE
500%
MORE PRODUCTIVE
THAN HAND-CODING
0%
RISK OF REWRITING
OUTDATED CODE
Imagine If You Could Understand Your Data
“Contact
Bill.Harison@gmail.com
for more information
about #AAPL and
#GOOG”
Person: William Harrison
Company: Apple, Inc
Company: Google
EXTRACT ENTITIES
WITH NATURAL
LANGUAGE PROCESSING
ENRICH DATASETS
WITH ADDRESS VALIDATION
AND GEOCODING
MATCH AND STANDARDIZE
FOR DATA QUALITY
AND DATA MASTERING
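As a rough illustration of the extraction step on this slide, the Python sketch below pulls an email address and ticker hashtags out of the sample message using plain regular expressions. This is a pattern-matching stand-in, not natural language processing, and the `TICKER_TO_COMPANY` table is a hypothetical lookup introduced for the example. Resolving the address to the person "William Harrison" would need matching against master data, which the sketch does not attempt.

```python
import re

# Hypothetical lookup table; a real system would use a reference dataset.
TICKER_TO_COMPANY = {"AAPL": "Apple, Inc", "GOOG": "Google"}

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
TICKER_RE = re.compile(r"#([A-Z]{1,5})\b")


def extract_entities(text):
    """Return (entity_type, value) pairs found in free text."""
    entities = [("Email", email) for email in EMAIL_RE.findall(text)]
    for ticker in TICKER_RE.findall(text):
        company = TICKER_TO_COMPANY.get(ticker)
        if company:  # skip hashtags that are not known tickers
            entities.append(("Company", company))
    return entities


message = ("Contact Bill.Harison@gmail.com for more information "
           "about #AAPL and #GOOG")
for entity in extract_entities(message):
    print(entity)
```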
Imagine If You Could Protect Your Data
PHI: Protected Health Information
PII: Personally Identifiable Information
Scalable to look for/discover ANY Domain type
ANALYZE STRUCTURE
OF DATA WITH BUILT-IN
DATA PROFILING
ISOLATE BAD DATA QUICKLY
WITH PROFILING STATISTICS
UNDERSTAND MEANING
AND CONTEXT OF DATA
IDENTIFY SENSITIVE DATA
WITH DATA DOMAIN REPORTS
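One way to picture profiling and data domain discovery is a column-level scan that gathers basic statistics and flags columns whose values mostly match a known sensitive pattern. The Python sketch below assumes three illustrative domains (`email`, `us_phone`, `us_ssn`) and an arbitrary 80% match threshold. None of these names or values come from the slides, and real domain discovery is considerably more involved.

```python
import re

# Illustrative patterns only; real PII/PHI detection needs far broader rules.
DOMAIN_PATTERNS = {
    "email": re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$"),
    "us_phone": re.compile(r"^\(?\d{3}\)?[-. ]?\d{3}[-. ]?\d{4}$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def profile_column(values, threshold=0.8):
    """Return basic profiling statistics and the data domains a column matches."""
    non_null = [v.strip() for v in values if v and v.strip()]
    stats = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "domains": [],
    }
    if non_null:
        for name, pattern in DOMAIN_PATTERNS.items():
            share = sum(1 for v in non_null if pattern.match(v)) / len(non_null)
            if share >= threshold:  # most values look like this domain
                stats["domains"].append(name)
    return stats


emails = ["jsmith@yahoo.com", "a.b@example.org", "", "x@y.io"]
phones = ["703-844-1212", "194-366-5858"]
print(profile_column(emails))
print(profile_column(phones))
```

The null and distinct counts serve the "isolate bad data" step, while the matched domains indicate which columns are candidates for masking.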
Imagine If You Could Provide Virtual Access…
CANONICAL DATA MODEL
PRODUCT …CUSTOMER ORDER
SOURCE
VIRTUALIZE
ANALYZE
ACCESS & MERGE
DATASETS USING
VIRTUAL TABLES
PREPARE IN REAL-TIME
WITH COMMON METADATA
REPOSITORY
Transactional
Data
Interaction
Data
Sensor/Device
Data
Documents/
Files
Industry
Formats
Business
Intelligence
Agile
Visualization
…Or Broker Physical Provisioning Automatically
SOURCE
Transactional
Data
Interaction
Data
Sensor/Device
Data
Documents/
Files
Industry
Formats
BROKER
ANALYZE
Business
Intelligence
Agile
Visualization
LOOSER COUPLING
BETWEEN SOURCE SYSTEMS
AND DESTINATION SYSTEMS
FASTER PROVISIONING
OF DATA TO DISTRIBUTED
CONSUMERS
Integration
Hub
Informatica Delivers Great Data For Any Initiative
Access Any
Data / Any
Volume
Faster Data
Onboarding
• Integrate
• Load
• Transform
• Cleanse
• Master
Offload Data
and
Processing
Batch
Deliver More
Trusted Data
Data
Warehouse
• Prepare
• Analyze
• Profile
• Cleanse
Offload to High
Performance
Storage
Realtime
Storage
#1) Define The Mission Of Your Journey
Identify
The
Data
Select
The
Consumers
Establish
The
Goal
#2) Deploy Leverage At Every Step
Leverage
Existing
Centers Of
Excellence
Leverage
Lightweight
Standards &
Processes
Leverage
Technology
for Scale and
Repeatability
#3) Deliver With Partners For Maximum Success
…
Strong Partner Ecosystem
• FREE 2-Hour Workshop
• DW Optimization Assessment
• Readiness Assessment
Proven Methodology
Data Integration
Data Quality
Cloud Data Integration
Data Archiving
Market Leading Platform
Great
Transportation
• Aspiration: Florida Turnpike sought to improve emergency preparedness and improve the prepaid toll program
• Challenge: Data collection took over one month, leading to faulty analytics
• Outcome: “Timely and accurate traffic, revenue, and participation reports help management make good choices that will eventually result in saving money.”
  - Bob Hartmann, IT Director, Florida Turnpike Enterprise
Great
Environment
• Aspiration: US Geological Survey sought to improve the quality of water in the United States
• Challenge: Collect distributed data and build a centralized water quality dataset
• Outcome: “We chose Informatica as our data integration solution because of its maturity, wide range of features, ease of use and industrial strength, integrated architecture.”
  - Harry House, Data Warehouse Practice Leader, USGS
Great
Education
• Aspiration: Rochester Institute of Technology sought to understand how it could improve student enrollment, student housing, and student retention
• Challenge: Data was in disparate systems
• Outcome: “We're becoming myth busters. Informatica provides timely, accurate information we need to spot trends, improve the quality of our academic learning, and reduce attrition.”
  - Kim Sowers, Director of Application Development, Rochester Institute of Technology
Great
Healthcare
• Aspiration: Utah Dept of Health sought to process healthcare claims faster and improve public health
• Challenge: Manual effort to track and link claims data over time
• Outcome: “We see the Informatica as absolutely essential to everything that we want to do, not only to meet our mandate for the All Payer Database,”
  - Dr. Keely Cofrin Allen, Director, Office of Health Care Statistics, State of Utah Department of Health
Repeatably deliver
trusted and timely data
for great analytics and
great social impact
Your Mission
Across nearly any data, any data platform, any data visualization
Informatica Delivers Great Data for Great Social Impact
Cloud
Big Data
Traditional /
Modern
Data Sources
Data Visualization
Data Platforms
Thank You
