This document summarizes a presentation on open data and information production processes in healthcare. It argues that (1) completely open data is not always possible, for example because of privacy, and that trusting data requires understanding how it is created, (2) information production consists of data generation, manipulation, and use, each of which can be opened at various levels, and (3) open data supports transparency, evidence-based practice, and open science, countering closed systems and insider politics.
1. Translated and updated from the original presentation at the seminar
“Openness and the Future of Healthcare IS” by Service Factory,
Aalto University School of Business, on 18 March 2014.
OPEN DATA - Critical Capability in Healthcare
Information Production Processes
Sami Laine, Doctoral Student
Department of Computer Science and Engineering, Aalto University
Email: sami.k.laine@aalto.fi
2. University of Turku, Finland
•Information systems
•Empirical field studies in hospital focusing on the use of IT.
Turku University Hospital, Finland
•Healthcare data warehousing
•Project management, system and service design.
Aalto University, Finland
•Usability Research
•Healthcare data quality research across contexts.
Personal background combines technical, social and healthcare perspectives
Over 10 years of involvement in the healthcare sector
3. Idealistic View – Open Data is Completely Free
Completely Free - Availability and Access
Completely Free - Reuse and Redistribution
Completely Free - Universal Participation
http://okfn.org/ http://fi.okfn.org/
This is neither enough nor always possible
4. Benchmarking claimed significant productivity differences in neurology specialty
Pirkanmaa Hospital District
Hospital District of Southwest Finland
Laine, S., Niemi, E. (2013), “Transparency of Hospital Productivity Benchmarking in Two Finnish Hospital Districts”, In the
Proceedings of the 29th annual Patient Classification Systems International (PCSI) Conference, Helsinki, Finland.
5. For example, fragmentation bias rewards splitting and heterogeneity
Figure labels: Hospital District; Hospital District; DRG X; Episode A; DRG X; DRG A, DRG B, DRG C
Less production but more health for same money!
More production but less health for same money!
Laine, S., Niemi, E. (2013), “Transparency of Hospital Productivity Benchmarking in Two Finnish Hospital Districts”, In the
Proceedings of the 29th annual Patient Classification Systems International (PCSI) Conference, Helsinki, Finland.
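The fragmentation bias on this slide can be illustrated with a small arithmetic sketch. It is not taken from the cited paper: the DRG weights, the total cost, and the `benchmark_productivity` function are hypothetical, and the assumption is simply that a naive benchmark sums DRG-weighted outputs and divides by cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrgRecord:
    """One recorded DRG grouping; the weight is a hypothetical cost weight."""
    code: str
    weight: float


def benchmark_productivity(records, total_cost):
    """Weighted DRG output per unit of cost, as a naive benchmark might compute it."""
    return sum(r.weight for r in records) / total_cost


TOTAL_COST = 1000.0  # same money spent in both districts (illustrative value)

# One district treats the patient in a single integrated episode.
integrated = [DrgRecord("DRG X", 1.0)]

# The other district records comparable care as three separate contacts.
fragmented = [
    DrgRecord("DRG A", 0.6),
    DrgRecord("DRG B", 0.6),
    DrgRecord("DRG C", 0.6),
]

integrated_score = benchmark_productivity(integrated, TOTAL_COST)
fragmented_score = benchmark_productivity(fragmented, TOTAL_COST)

# The fragmented district scores higher although no extra health was produced.
print(f"integrated: {integrated_score:.4f}, fragmented: {fragmented_score:.4f}")
```

Under these assumed weights the split recording yields a higher score for the same money, which is the sense in which the measure rewards splitting.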
6. IPP has three problem themes: Human Errors, Software Features and Obscurity
Diagram of the Information Production Process (IPP):
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Roles: Enters data for primary purpose; Builds data sets for secondary use; Analyses and reports data; Interprets data and makes decisions
Systems and outputs: Medical Imaging System; Electronic Patient Record; Data Warehouse (two); Internal Service Reports; Finnish Hospital Productivity Benchmarking
Scripts: collect data (two); produce internal reports; produce external data sets; produce external reports
Problems marked: Data Entry Errors; Application Feature Bias; Architecture Bias; Scripting Error; Interpretation Mismatch
Laine, S., Niemi, E. (2013), “Transparency of Hospital Productivity Benchmarking in Two Finnish Hospital Districts”, In the
Proceedings of the 29th annual Patient Classification Systems International (PCSI) Conference, Helsinki, Finland.
7. IPP has three problem themes: Human Errors, Software Features and Obscurity
Diagram of the Information Production Process (IPP):
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Roles: Enters data for primary purpose; Builds data sets for secondary use; Analyses and reports data; Interprets data and makes decisions
Systems and outputs: Medical Imaging System; Electronic Patient Record; Data Warehouse (two); Internal Service Reports; Finnish Hospital Productivity Benchmarking
Scripts: collect data (two); produce internal reports; produce external data sets; produce external reports
Problems marked: Data Entry Errors; Application Feature Bias; Architecture Bias; Scripting Error (two); Obscurity (four)
Laine, S., Niemi, E. (2013), “Transparency of Hospital Productivity Benchmarking in Two Finnish Hospital Districts”, In the
Proceedings of the 29th annual Patient Classification Systems International (PCSI) Conference, Helsinki, Finland.
8. Inconsistent figures have been produced about the same issue at the same time…
Ambulatory Procedures in Administrative Reports
Ambulatory Procedures in Operation Room Reports
Which one is correct?
Laine, S. (2012), "APC-SIMULATOR: Demonstrating the Effects of Technical and Semantic Errors in the Accuracy of Hospital
Reporting" In the Proceedings of the 17th International Conference on Information Quality (ICIQ), Paris, France.
9. Inaccuracies and semantic mismatches exist in healthcare data
“Patient bills”
“Manually duplicated codes”
“Planned procedures”
1568710726
Both are more accurate than expected but only for a specific purpose.
“Actually Performed Ambulatory Procedures”
- X % + Y %
There exist semantic mismatches and error rates between contexts for good reasons.
Laine, S. (2012), "APC-SIMULATOR: Demonstrating the Effects of Technical and Semantic Errors in the Accuracy of Hospital
Reporting" In the Proceedings of the 17th International Conference on Information Quality (ICIQ), Paris, France.
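The semantic mismatch described above can be sketched as one set of records counted under three definitions. The records, field names, and the `count_by_definition` helper are hypothetical and are not taken from the APC-SIMULATOR paper; the point is only that each count is correct for its own purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcedureRecord:
    """Hypothetical ambulatory procedure record with three status flags."""
    patient_id: str
    planned: bool
    performed: bool
    billed: bool


RECORDS = [
    ProcedureRecord("p1", planned=True, performed=True, billed=True),
    ProcedureRecord("p2", planned=True, performed=False, billed=False),  # cancelled
    ProcedureRecord("p3", planned=False, performed=True, billed=True),   # unplanned
    ProcedureRecord("p4", planned=True, performed=True, billed=False),   # no patient bill
    ProcedureRecord("p5", planned=False, performed=True, billed=False),  # unplanned, no bill
]


def count_by_definition(records):
    """Count 'ambulatory procedures' under three different definitions."""
    return {
        "planned": sum(r.planned for r in records),
        "performed": sum(r.performed for r in records),
        "billed": sum(r.billed for r in records),
    }


counts = count_by_definition(RECORDS)
print(counts)
```

A report built on planned procedures and a report built on patient bills would disagree here without either containing a data entry error.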
10. You cannot trust Open Data unless you know exactly where data comes from and how it is actually created!
All Ambulatory Procedures
Planned or Billed
Patient Bills or Municipality Invoices?
Detailed level in actual socio-technical reality!
Processes and Work Practices
User Interfaces and Data Entry Protocols
Data Models and Application Structures
One must also describe the Information Production Process behind the interface to avoid black boxes
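One way to describe the process behind a data set is to publish a short production description alongside it. The sketch below is only an illustration of that idea: the `DataSetDescription` and `ProductionStep` classes, the phase names used as keys, and the example steps are all assumptions, not an existing standard or the author's tooling.

```python
from dataclasses import dataclass, field

# Phase keys follow the supply / manufacturing / consumption split used in the slides.
PHASES = ("supply", "manufacturing", "consumption")


@dataclass(frozen=True)
class ProductionStep:
    phase: str        # one of PHASES
    actor: str        # person, system or script that performs the step
    description: str  # what actually happens to the data


@dataclass
class DataSetDescription:
    name: str
    steps: list = field(default_factory=list)

    def add_step(self, phase, actor, description):
        if phase not in PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self.steps.append(ProductionStep(phase, actor, description))

    def undocumented_phases(self):
        """Phases with no described step, i.e. remaining black boxes."""
        documented = {step.phase for step in self.steps}
        return [phase for phase in PHASES if phase not in documented]


description = DataSetDescription("Ambulatory procedures (hypothetical example)")
description.add_step("supply", "nurse", "Enters the planned procedure in the patient record")
description.add_step("manufacturing", "collection script", "Copies procedure rows to the data warehouse")

print(description.undocumented_phases())
```

A consumer of the data set could use such a description to see which parts of the process are still undocumented before trusting the figures.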
11. Information Production Process (IPP) consists of three phases based on Total Quality Management
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Roles: Enters data for primary purpose; Builds data sets for secondary use; Analyses and reports data; Interprets data and makes decisions for secondary purposes
Systems and outputs: Medical Imaging System; Electronic Patient Record; Data Warehouse; Monthly Service Reports; National Hospital Benchmarking; National Registry
Scripts: collect data (two); produce internal reports; produce external data sets
Wang, R. Y., Lee, Y. W., Pipino, L. L., Strong, D. M. (1998) “Manage Your Information as a Product”, Sloan Management Review, 39, 4, pp. 95-105.
12. Open Data also means Open Methods and Open Processes
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Process lanes: Information Production Process of Researchers; Information Production Process of Health Service Providers; Information Production Process of National Organizations
Systems and outputs: Medical Imaging System; Electronic Patient Record; Local Data Warehouse; Management Reports; National Hospital Benchmarking; National Data Registry; Scientific Data Set; Research Results
Scripts: collect data (two); produce internal reports; produce external data sets (two); produce public reports; produce external analysis
Overlay labels: DATA; METHODS; PROCESS; OPEN DATA
The problem is often the obscurity of actual data creation situations
Another problem is earlier information production processes
13. An IPP capable of openness can be secured or opened at any phase
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Process lanes: Information Production Process of Researchers; Information Production Process of Health Service Providers; Information Production Process of National Organizations
Systems and outputs: Medical Imaging System; Electronic Patient Record; Local Data Warehouse; Management Reports; National Hospital Benchmarking; National Data Registry; Scientific Data Set; Research Results
Scripts: collect data (two); produce internal reports; produce external data sets (two); produce public reports; produce external analysis
Overlay labels: PRIVATE; OPENNESS; CONFIDENTIAL; OPEN
What aggregation level can be opened to confidential or even public use?
“Truly Ideologically Open Data”
14. Interfaces, Processes and Methods can always be made open
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Process lanes: Information Production Process of Researchers; Information Production Process of Health Service Providers; Information Production Process of National Organizations
Systems and outputs: Medical Imaging System; Electronic Patient Record; Local Data Warehouse; Management Reports; National Hospital Benchmarking; National Data Registry; Scientific Data Set; Research Results
Scripts: collect data (two); produce internal reports; produce external data sets (two); produce public reports; produce external analysis
Overlay labels: DATA MODELS! APPLICATION INTERFACES! TOOLS! SCRIPTS! ALL OPEN! PROCESSES; WORKFLOWS
These should be OPEN at least for INTERNAL USE!
15. Open Data supports Evidence-Based Management, Open Government and Open Science
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Systems and outputs: Medical Imaging System; Electronic Patient Record; Local Data Warehouse; Management Reports; National Hospital Benchmarking; National Data Registry; Scientific Data Set; Research Results
Scripts: collect data (two); produce internal reports; produce external data sets (two); produce public reports; produce external analysis
Overlay labels: DATA; METHODS; PROCESSES; OPEN X!
OPEN SCIENCE
EVIDENCE-BASED MANAGEMENT
OPEN GOVERNMENT
16. Closed Data leads to traditional habits and beliefs, insider power politics and flawed science.
Phases: DATA SUPPLY, DATA MANUFACTURING, DATA CONSUMPTION
Systems and outputs: Medical Imaging System; Electronic Patient Record; Local Data Warehouse; Management Reports; National Hospital Benchmarking; National Data Registry; Scientific Data Set; Research Results
Scripts: collect data (two); produce internal reports; produce external data sets (two); produce public reports; produce external analysis
Overlay labels: DATA; METHODS; PROCESSES; Closed & Obscure; OBSCURITY (three)
Flawed Science
Habits and Beliefs
Insider Politics
17. Distributing Data
Typically Open Data
Product Data
Organization Data
Service Data
Administrative Data
Customer Instructions
Potential but challenging Open Data
Distributing Sensitive Data by using External Authorization
Distributing Sensitive Data in Aggregated Form
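Distributing sensitive data in aggregated form could look roughly like the sketch below, which counts records per group and withholds small groups. The threshold of 5, the field name `municipality`, and the `aggregate_with_suppression` function are assumptions made for illustration; suppressing small cells alone is not a complete privacy safeguard.

```python
from collections import Counter

MIN_CELL_SIZE = 5  # hypothetical threshold; real policies set their own limits


def aggregate_with_suppression(rows, key, min_cell_size=MIN_CELL_SIZE):
    """Count rows per group and replace counts below the threshold with None."""
    counts = Counter(row[key] for row in rows)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }


# Individual-level rows stay private; only the aggregate is distributed.
rows = [{"municipality": "A"}] * 7 + [{"municipality": "B"}] * 2

published = aggregate_with_suppression(rows, "municipality")
print(published)
```

The same function can be run with a stricter or looser threshold depending on whether the aggregate is meant for confidential or public use.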
18. Acquiring Data
Typically Open Data
Geographical data
Aggregated demographic data
Aggregated statistics
News and social media feeds
Reference data
Potential but challenging Open Data
Full-scale master data (e.g. customers, services etc.)
External service transactions (e.g. health services, transportation services etc.)
Sensitive individual-level demographic data
19. All Information Production Processes should have
Open Data capability
Open Data should not be an “ideological product” but a “practical capability” that is utilized everywhere
Open Data as a capability means Open Data, Open Methods and Open Processes
For internal use
For external users
However, the level of openness of data sets is controlled according to their content requirements.
Data Sets
Data Models
Application Interfaces
Tools
Workflows
Documents
Open Data Capability
All processes
Every workflow
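The idea that every artifact has an openness level controlled by its content can be sketched as a small catalogue. The level names follow the PRIVATE / CONFIDENTIAL / OPEN labels used earlier in the slides, but the catalogue entries, the audience names, and the rule that internal users see everything are assumptions made for this example.

```python
from enum import Enum


class Openness(Enum):
    PRIVATE = "private"
    CONFIDENTIAL = "confidential"
    OPEN = "open"


# Which openness levels each (hypothetical) audience may access.
ALLOWED = {
    "internal": {Openness.PRIVATE, Openness.CONFIDENTIAL, Openness.OPEN},
    "authorized": {Openness.CONFIDENTIAL, Openness.OPEN},
    "public": {Openness.OPEN},
}

# Illustrative catalogue: data sets are restricted by content, methods are open.
CATALOGUE = {
    "patient-level data set": Openness.PRIVATE,
    "aggregated benchmarking data set": Openness.CONFIDENTIAL,
    "data model": Openness.OPEN,
    "collection scripts": Openness.OPEN,
    "workflow description": Openness.OPEN,
}


def visible_artifacts(audience):
    """Names of catalogue artifacts the given audience may access, sorted."""
    allowed_levels = ALLOWED[audience]
    return sorted(name for name, level in CATALOGUE.items() if level in allowed_levels)


print(visible_artifacts("public"))
```

In this sketch the data sets stay restricted while the data model, scripts and workflow description are visible to everyone, matching the slide's distinction between open data sets and open methods and processes.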
20. Prevents Vendor Lock-in
• Open standards and Application Interfaces
• Open collaboration between stakeholders and use cases
Improves Transparency
• To own healthcare service production
• To own information management and software systems
Supports Quality Control
• Own internal activity (patient services or medical device maintenance)
• Service providers (e.g. software or logistics)
• External Benchmarking (e.g. international productivity benchmarking)
Open Information Production brings Transparency
and Interoperability to Software Systems and
Healthcare
21. Application for Strategic Research Opening at 11.6.2012
Project time schedule 1.9.2012-31.8.2014
QUALIDAT
QUALITY OF DATA FOR VALID DECISIONS
Supporting Diverse Uses and Users
Nieminen, Marko (prof.)
Rossi, Matti (prof.)
Borgman, Jukka
Kaipio, Johanna
Laine, Sami
Mahlamäki, Katrine
Niemi, Erkka
http://qualidat.aalto.fi/
22. QUALIDAT RESEARCH = Tracking down the entire information flow
Phases: DATA GENERATION, DATA MANIPULATION, DATA UTILIZATION
Roles: Enters data for primary purpose; Builds data sets for secondary use; Analyses and reports data; Interprets data and makes decisions for secondary purposes
Data sources: Big Data (e.g. medical device); Social Data (e.g. Facebook); Operative Data (e.g. hospital ERP)
Processing: Scripts collect manually entered data; Application inspects machine generated event data; Innovative algorithms produce analyses; Scripts construct data sets for external use
Stores and outputs: Analytical Data Warehouse; Statistical Reports; Open Data
http://qualidat.aalto.fi/
23. The Researched Domains in the Information Production Process
Phases: DATA GENERATION, DATA MANIPULATION, DATA UTILIZATION
Roles: Enters data for primary purpose; Builds data sets for secondary use; Analyses and reports data; Interprets data and makes decisions for secondary purposes
USER INTERFACE: “Screens”, “Interactions”
DATA FLOW: “Script logic”
WORK PRACTICES: “Processes”, “Guidelines”
APPLICATION LOGIC: “Data model”, “Interfaces”
DATA MODEL: “Models”, “Descriptions”
ANALYTICS: “Business rules”, “Aggregation logic”
MEASUREMENT: “Method”
BUSINESS FUNCTION: “Domain terminology”
BUSINESS CASE: “Situation”, “Conclusions”
http://qualidat.aalto.fi/
24. Scientific publications about open data and open science
Nosek, B. A., & Bar-Anan, Y. (2012), Scientific Utopia: I. Opening scientific communication. Psychological Inquiry, 23, pp. 217–243.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012), Scientific Utopia: II. Restructuring Incentives and Practices to Promote Truth Over Publishability. Perspectives on Psychological Science, 7(6), pp. 615–631.
25. Some webpages about Openness
Open Science
National Research Data Initiative (TTA)
Open activism
Open Knowledge Finland ry
Openness of ICT
The Roadmap for Open ICT Ecosystems, Berkman Center for Internet & Society at Harvard Law School