Moving Data Science from an Event to a Program
Wayne Applebaum, Ph.D.
What Gartner Sees
“… by 2017, 33 percent of Fortune 100 organizations will experience an information crisis, due to their inability to effectively value, govern and trust their enterprise information.”
Gartner Press Release, February 27, 2014
How it should work
Business processes and business decisions bracket a robust infrastructure of data tools and processes.
[Diagram labels: Transactional Information/Other Data; Measures Analytics Tools; Business Decisions; Target data store; Load Quality Processes; Business Processes]
How it usually works
Silos of data that are difficult to put together
"Those who don't know history are destined to repeat it." - Edmund Burke
Here’s a Data Scientist Viewpoint
• Identifying Data Sources
• Data Correctness
• Data Quality
• Business Involvement
• Multiple Sources
• Data Governance
• Flexibility
• Takes up 80% of their time
Why the Problem is Getting Worse
• The use of data, and the value placed on it, is increasing
• More decisions are being made in the same amount of time
• Answers aren’t in the silos; you need to cross the silos to get them
• Business demand for information-based decisions is not discussed in the popular media
Pressure and opportunity for data and analytics are rising
Reuse is becoming a business necessity
Emergence of Business Decision Data
[Diagram labels: Data; Business Decision Data; Master; Transactional]
While the basic rules of Data Governance remain the same, the scope is expanding.
Transactions Vs. Decisions
Transactions: Process each transaction as quickly as possible
Decisions: Consolidate information to make the correct decision as quickly as possible
The Data Governance No-Free-Lunch Rule
When it comes to integrating data sources, there is no free lunch.
You have to understand the data and context to be able to make decisions.
Creating the Data Hub: Overview
• Scope
• Identifying Key Objects/Values
• Creating the Controlled Vocabulary
• Object Mapping
• Creating the Canonical/Target Model
• Creating Rules and Standards
• Implementing Data Retrieval
• Creating the User Interface
• Developing Load Procedures
• Architecture Decisions: Ingestion, Database, Data Governance, Retrieval
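The controlled-vocabulary and object-mapping steps above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and does not come from the slides: the field names, the synonym lists, and the helper functions `build_object_mapping` and `to_canonical` are all hypothetical. It shows one possible way to map fields from a source silo onto a canonical model while surfacing anything the vocabulary does not yet cover.

```python
# Hypothetical controlled vocabulary: canonical term -> synonyms seen in source silos.
# The terms and synonyms are illustrative assumptions, not taken from the slides.
CONTROLLED_VOCABULARY = {
    "customer_id": {"cust_id", "customer_no", "client_id"},
    "order_total": {"amount", "total", "order_amt"},
}


def build_object_mapping(vocabulary):
    """Invert the vocabulary into a source-field -> canonical-field lookup."""
    mapping = {}
    for canonical, synonyms in vocabulary.items():
        mapping[canonical] = canonical
        for synonym in synonyms:
            mapping[synonym] = canonical
    return mapping


def to_canonical(record, mapping):
    """Return (canonical_record, unmapped_fields) for one source record."""
    canonical, unmapped = {}, []
    for field, value in record.items():
        key = mapping.get(field.strip().lower())
        if key is None:
            # Unknown fields are reported, not guessed at or silently dropped.
            unmapped.append(field)
        else:
            canonical[key] = value
    return canonical, sorted(unmapped)


MAPPING = build_object_mapping(CONTROLLED_VOCABULARY)

# A row as it might arrive from one silo (hypothetical billing extract).
billing_row = {"Cust_ID": "C-17", "Amount": 120.0, "Region": "EMEA"}
canonical_row, unmapped = to_canonical(billing_row, MAPPING)
print(canonical_row)
print(unmapped)
```

In this sketch the unmapped `Region` field is returned to the caller instead of being discarded, which mirrors the no-free-lunch point: someone who understands the data still has to decide how each new field fits the canonical model.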
Where do we go from here?
• Implement Data Governance early
• Integrate Data Governance across silos
• Recognize that Data Governance doesn’t end with Master Data
• Big Data represents a new challenge because the meaning of a transaction is no longer defined on entry
• Create the governance and structures to support both transactions and decisions
• Consider Data Hubs for cross-silo integrations
Governance is essential for reuse, and reuse is essential to maximize value

Moving Data Science from an Event to A Program: Considerations in Creating Sustainable and Reusable Data Sources
