Rapid Data Integration and Curation
Delivering Business Value in the First 24 Hours
Organizations must onboard new data sources more frequently and more quickly. This presentation covers practices that deliver business value rapidly, shrinking time to value from months to days.

Business decisions are becoming increasingly dependent on analyzing an ever-greater volume of data coming from a growing number of sources. Mobile technology is providing immediate access to data whenever and wherever it is needed. Users, customers, and business partners are waiting for answers, and the organization must reduce the time required to collect, understand, and analyze the data needed to provide those answers. Modern enterprises need to increase the agility, flexibility, and speed with which they can analyze a growing volume, variety, and velocity of data.

This presentation discusses a method for rapid data integration and curation:

- Techniques for light data integration of new data with existing data assets
- Framework for data quality management
- Refining data integration through evolutionary modeling
- Managing curation processes
- Validating business value

Timely delivery of new data assets allows users to begin asking questions earlier and getting answers more quickly, allowing the organization to uncover the new insights that drive lasting business benefits.

Published in: Technology, Education
Rapid data integration and curation

  1. Rapid Data Integration and Curation: Delivering Business Value in the First 24 Hours
     - Speaker: Thomas Kelly, Practice Director, Semantic Technology Center of Excellence, Enterprise Information Management, Cognizant Technology Solutions, Inc.
     - ©2013, Cognizant
  2. Agenda
     - 1: Delivering Business Value
     - 2: Barriers to Rapid Data Integration
     - 3: Rapid Data Integration and Curation Method
  3. We are at an Inflection Point at which Value is Created or Destroyed
     - Source: The Motley Fool
  4. Delivering Information Faster Produces Direct, Measurable Business Value
     - What Difference Does One Day Make?
     - A blockbuster drug generates $3M+ in revenue per day; a one-day delay in completing clinical trials can generate up to $500K in additional costs
     - Banking: a moderate-sized brokerage firm can generate up to $1M in financial services revenue per day
  5. Barriers to Rapid Data Integration
     - Rework is expensive; must “get it right” from the start
     - Fit with the existing data; avoid data silos
     - Reconciling differences (data formats, coding, identifiers, etc.)
     - Managing data quality (accuracy, precision, context)
     - Knowledge acquisition takes time; new insights come from experimentation
     - Overcoming process inertia
  6. Evolutionary Method to Data Integration and Curation
     Responsive Data Approach
     - As new information flows into the enterprise, people and processes are dynamic in nature
     - Questions arising during this phase are “what to do” and “how to make the best sense of the new data source”. Rapid integration tools aid in quick prototyping and in building solutions of value
     Rapid Integration and Curation Method
     - The data is profiled and explored for value and quality issues
     - A rapid pruning exercise is undertaken by prototyping and integrating with in-house data to evaluate whether the data is fit for purpose. The outcome helps formulate an effective approach for further phases
     Information Management Approach
     Managed
     - As we progress, issues with the new data are identified and managed. The main focus is on establishing data quality and adhering to enterprise standards and frameworks while building optimal integration approaches
     - The integration process is evolutionary, as further discoveries are made for optimal design
     Evolutionary
     - Progressive build based on the new data
     - Building awareness of the new platform and fine-tuning the capabilities around the data source are the primary activities
     Proactive
     - Data management evolves to a more refined state. A feedback loop is built to enable proactive decisions around data organization and access
     - Data integration is efficient and stable, with verifiable compliance and security
     - Integrated with the enterprise information management framework
     Predictable
     - The services built around the new data sources are now managed
     - The focus is on evolution of business processes, based on managed models
     Time axis labels: Tactical, Progressive, Managed; First 1-5 Days, First 1-3 Months, After 3 Months
  7. Leverage Insights and Expertise, Rapidly and Sustainably
     - Reuse Expertise: identify and leverage existing, relevant data assets and expertise
     - Analyze: ingest new data sources (light integration and curation)
     - Realize Benefits: monitor and measure use and benefits achieved; identify next set of priorities
     - Extend: create and extend data relationships, leveraging insights from previous study cycles
     - Govern: elevate proven data, relationships, and expertise to organization-wide definition
     - Refine: capture insights from new data analysis cycles, refining relationships to support new analytics
  8. Can You Help Me With Some Data?
  9. Rapid Data Integration and Curation Method
     - 1: Define Preliminary Objectives
     - 2: Profile the New Data
     - 3: Generate Initial Ontology for the New Data
     - 4: Generate Initial Ontology for the Existing Data (if necessary)
     - 5: Integrate Entities over Common URIs
     - 6: Create URI Links
     - 7: Add Initial Data Quality Filters
     - 8: Analyze Data and Generate Feedback
  10. Step 1: Define Preliminary Objectives
     - 1. Discuss Functional and Timing Objectives, and Priorities
     - 2. Clarify Immediate, Short-Term, and Long-Term Business Value (SMART *)
       a. Cost Reduction/Avoidance
       b. Meet Critical Customer Need
     - 3. Is This the Right Solution?
     - 4. Set Expectations
       a. Evolutionary Process
       b. Initial Results Quickly
       c. Frequent, Active Participation
       d. Feedback Critical to Making Refinements
     - 5. Brainstorm Deliverables that Produce Business Benefits; Define a Few Sample Queries
     - 6. Ask for Commitment to Benefits Realization
     - 7. Start the Clock!
     * SMART: Specific, Measurable, Attainable, Realistic, and Traceable
  11. Step 2: Profile the New Data
     - Light Profiling, focusing on Understanding Key Data Elements Needed to Meet the First Deliverable
     - Identify Initial Data Filtering Candidates
     - Capture Insights about Key Data Relationships
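
The slide names no profiling tool, so the following is only a minimal sketch of what "light profiling" of key data elements could look like. The function name `profile_columns`, the sample rows, and the column names are all hypothetical; only the standard library is used.

```python
from collections import Counter

def profile_columns(rows, key_columns):
    """Return a light profile (row count, missing, distinct, top values) per key column."""
    report = {}
    for col in key_columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing values.
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "top_values": counts.most_common(3),
        }
    return report

# Hypothetical sample of a newly received customer feed.
sample_rows = [
    {"customer_id": "C1", "zip_code": "07666"},
    {"customer_id": "C2", "zip_code": ""},
    {"customer_id": "C3", "zip_code": "07666"},
]
report = profile_columns(sample_rows, ["customer_id", "zip_code"])
```

Columns with many missing values or unexpected top values are natural candidates for the initial data filters mentioned on the slide.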
  12. Step 3: Generate Initial Ontology for the New Data
     - Reverse-engineer Ontology from New Data
     - Load New Data into the RDF Store (or Create Link to the Data)
     - Create Business-relevant Synonyms for High-Importance Attributes
     - Refinements will be made in Future Iterations
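
One way to picture reverse-engineering a starter ontology from tabular data is sketched below. It is an illustration under assumptions, not the tooling used in the presentation: triples are plain Python tuples standing in for an RDF store, and the namespace `http://example.org/new#`, the function names, and the sample row are hypothetical.

```python
def rows_to_triples(rows, entity, key, ns="http://example.org/new#"):
    """Derive a starter ontology (one class, one property per column) plus instance triples."""
    cls = ns + entity
    ontology = {(cls, "rdf:type", "owl:Class")}
    data = set()
    for row in rows:
        subject = f"{ns}{entity}/{row[key]}"
        data.add((subject, "rdf:type", cls))
        for column, value in row.items():
            prop = ns + column
            # Each source column becomes a datatype property of the class.
            ontology.add((prop, "rdf:type", "owl:DatatypeProperty"))
            ontology.add((prop, "rdfs:domain", cls))
            data.add((subject, prop, value))
    return ontology, data

def add_synonym(ontology, prop, label):
    """Attach a business-relevant label to a high-importance attribute."""
    ontology.add((prop, "rdfs:label", label))

rows = [{"customer_id": "C1", "zip_code": "07666"}]
ontology, data = rows_to_triples(rows, "Customer", "customer_id")
add_synonym(ontology, "http://example.org/new#zip_code", "Postal Code")
```

The generated ontology is deliberately rough; as the slide notes, refinements are left to future iterations.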
  13. Step 4: Generate Initial Ontology for the Existing Data (if necessary)
     - Map Selected Entities and Critical Attributes for Existing Data Source(s) to the Source-specific Ontology
     - Add Reference to the Source-specific Ontology to the New Data Ontology
     - Refinements will be made in Future Iterations
     - New Data Ontology manages integration with Existing Data until the ontology is sufficiently mature to be promoted into an enterprise ontology
     - Diagram labels: Existing Data, New Data
  14. Step 5: Integrate Entities over Common URIs
     - Different URIs, Separately Maintained
     - Focus on Key Entities
     - Equivalence Functions Logically Integrate the Federated Data
     - Reduces Query Complexity and Can Improve Query Performance
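
The slide does not define its equivalence functions, so the sketch below assumes one simple form: two entities are treated as the same when a normalized key attribute matches. The matching rule, the `email` key, and all URIs are hypothetical; the `owl:sameAs` predicate is the standard OWL way to state that two URIs denote one entity.

```python
def normalize(value):
    """Normalize a key value so trivially different spellings still match."""
    return value.strip().lower()

def equivalence_links(new_entities, existing_entities, key):
    """Return (new_uri, 'owl:sameAs', existing_uri) triples for matching key values."""
    index = {normalize(attrs[key]): uri for uri, attrs in existing_entities.items()}
    links = []
    for uri, attrs in new_entities.items():
        match = index.get(normalize(attrs[key]))
        if match is not None:
            links.append((uri, "owl:sameAs", match))
    return links

def canonical(uri, links):
    """Resolve a URI to its existing-data equivalent, if one was linked."""
    mapping = {new: old for new, _, old in links}
    return mapping.get(uri, uri)

# Hypothetical entities from the new source and from an existing source.
new_customers = {
    "new:cust/17": {"email": " Ann@Example.org "},
    "new:cust/18": {"email": "bob@example.org"},
}
existing_customers = {"crm:customer/A-9": {"email": "ann@example.org"}}

links = equivalence_links(new_customers, existing_customers, "email")
```

Both sources keep their own URIs; only the link triples are added, which matches the slide's point that the URIs stay separately maintained.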
  15. Step 6: Create URI Links
     - Diagram (before): Customer cust:ZipCode JOIN geo:ZipCode Geography
     - Diagram (after): Customer cust:ZipCodeURI LINK Geography
     - The Data has Common Values that can be used in Join Operations, but Doesn’t have Links
     - Links Reduce Query Complexity and Can Improve Query Performance
     - Focus on Key Queries, Identify Complex or Time-Sensitive Joins
     - Add Linking URI Attribute to Dependent Entity
     - Amend Selected Queries to Leverage the New Link
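
The difference between joining on a shared value and following a link can be sketched with plain dictionaries standing in for the RDF store. The attribute names `cust:ZipCode`, `geo:ZipCode`, and `cust:ZipCodeURI` come from the slide; the entity URIs, the `geo:City` attribute, and the sample values are hypothetical.

```python
# Hypothetical data: entity URI -> attributes.
geography = {
    "geo:zip/07666": {"geo:ZipCode": "07666", "geo:City": "Example City"},
    "geo:zip/10001": {"geo:ZipCode": "10001", "geo:City": "Other City"},
}
customers = {
    "cust:C1": {"cust:ZipCode": "07666"},
    "cust:C2": {"cust:ZipCode": "99999"},  # no matching geography entry
}

def city_by_join(customer_uri):
    """Before linking: compare ZIP values across both entity sets on every query."""
    zip_value = customers[customer_uri]["cust:ZipCode"]
    for geo in geography.values():
        if geo["geo:ZipCode"] == zip_value:
            return geo["geo:City"]
    return None

def add_zip_links():
    """Add the linking URI attribute to the dependent entity (Customer)."""
    by_zip = {attrs["geo:ZipCode"]: uri for uri, attrs in geography.items()}
    for attrs in customers.values():
        target = by_zip.get(attrs["cust:ZipCode"])
        if target is not None:
            attrs["cust:ZipCodeURI"] = target

def city_by_link(customer_uri):
    """After linking: follow the stored URI directly, with no value comparison."""
    link = customers[customer_uri].get("cust:ZipCodeURI")
    return geography[link]["geo:City"] if link else None

add_zip_links()
```

The value comparison is done once when the links are created, so amended queries only dereference a URI.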
  16. Step 7: Add Initial Data Quality Filters and Transformations
     - Diagram (Traditional Data Warehouse): Data Source A, Data Source B, and Data Source C pass through ETL into the Data Warehouse, annotated “Data Quality Happens Here”, “Data Quality Happens Here”, “And Here”
     - Diagram labels: Existing Data, New Data
     - JIT Data Quality Management, Everywhere that it is Needed
     - Data Filtering and Transformation Rules are Encoded in the Ontology
     - Focus is on Critical Data Quality Rules
     - Rule Updates are Automatically in Effect, without Reloading All of the Data
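
The idea that rule updates take effect without reloading data can be illustrated by applying rules when data is read rather than when it is loaded. In this sketch a Python dictionary stands in for rules encoded in the ontology; the rule format, the function `apply_rules`, and the sample rows are assumptions made for illustration.

```python
def apply_rules(rows, rules):
    """Apply per-column transforms and filters at read time; stored rows are never modified."""
    result = []
    for row in rows:
        cleaned = dict(row)
        keep = True
        for column, column_rules in rules.items():
            value = cleaned.get(column)
            for kind, fn in column_rules:
                if kind == "transform":
                    value = fn(value)
                elif kind == "filter" and not fn(value):
                    keep = False
                    break
            if not keep:
                break
            cleaned[column] = value
        if keep:
            result.append(cleaned)
    return result

# Hypothetical stored data, loaded once and left untouched.
stored_rows = [{"zip_code": " 07666 "}, {"zip_code": "ABC"}]

# Initial critical rules: trim whitespace, then require a five-digit ZIP code.
rules = {
    "zip_code": [
        ("transform", str.strip),
        ("filter", lambda v: len(v) == 5 and v.isdigit()),
    ]
}
strict = apply_rules(stored_rows, rules)

# Relaxing the filter changes the next read; nothing is reloaded.
rules["zip_code"][1] = ("filter", lambda v: v != "")
relaxed = apply_rules(stored_rows, rules)
```

Because the stored rows are unchanged, a rule that turns out to be too strict can be corrected and the affected records reappear on the next query.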
  17. Step 8: Analyze Data and Generate Feedback
     - Demonstrate Visualization using Sample Queries
     - Walk Through Available Data Sets and Data Organization
     - Experiment with Data Access and New Visualizations
     - Provide Next Steps Recommendations to Refine the Data Integration and Curation
  18. Architectural Foundation for Rapid Data Integration and Curation
     - SPARQL-based Visualization
     - Relational-to-RDF Mapping
     - Data Profiling
     - Ontology Editor
     - Automated Ontology Generation
     - RDF Store Data Import
     - RDF Store
  19. Capabilities That We Have Introduced
     - Rapid Response to New Data Onboarding Needs
     - Process for Evolutionary Data Integration and Curation
     - Flexible Design that is Responsive to Business Changes
     - Foundation for Refinement and Expansion of Ontology Models from Fit-for-Purpose to Department, to Business Unit, to Enterprise
  20. Questions?
  21. Thank you!
  22. Speaker: Thomas (Tom) Kelly, Practice Director, Enterprise Information Management, Cognizant
     Thomas Kelly is a Director in Cognizant’s Enterprise Information Management (EIM) Practice and heads its Semantic Technology Center of Excellence, a technology specialty of Cognizant Business Consulting (CBC). He has 20-plus years of technology consulting experience in leading data warehousing, business intelligence and big data projects, focused primarily on the life sciences and healthcare industries. Tom can be reached at Thomas.Kelly@cognizant.com.