This document discusses methods for facilitating data sharing and secondary use of health data while preserving privacy, including anonymization and secure multi-party computation. It provides examples of how these methods allow useful analyses like disease surveillance and anonymous record linkage to be performed without direct access to identifying information. Secure computation techniques allow computations on encrypted data, while anonymization aims to prevent re-identification of data through techniques like de-identification. Critical factors in applying these methods include managing risks, engaging relevant stakeholders, and protecting intellectual property.
1. Deploying SMC in Practice
Khaled El Emam
Electronic Health Information Laboratory & uOttawa
EXAMPLES IN HEALTHCARE SETTINGS
2. Electronic Health Information Laboratory, CHEO Research Institute, 401 Smyth Road, Ottawa K1H 8L1, Ontario; www.ehealthinformation.ca
Researchers’ Need for Data
• Digitization, performance-based funding, greater interoperability, and fiscal pressures make more data available for research
• Linked data allows analyses to span more of the
continuum of care and look at social
determinants of health
• Severe competition for research funding makes it urgent to provide access to data to support proposals, funding, and delivery of results
Benefits of Sharing Research Data
• Confirmation of published results
• Availability for meta-analyses
• Feedback to improve data quality
• Cost savings from not collecting the data
again
• Minimize need for participants to provide
data repeatedly
• Data for instruction and education
Benefits of Sharing Data: Commercial
• Software testing
• Targeted marketing campaigns
• Post marketing surveillance
• Monetization of data
• Information product development
• Internal analytics (models for decision
support)
• Device diagnostics
Regulatory Framework
• Legislation and regulations cover personally identifiable health data
• When not mandated or permitted, use and
disclosure of health data for secondary
purposes requires either consent or
anonymization in accordance with
regulations
Secondary Purposes
• Secondary purposes are non-direct-care uses of personal health information, including:
– Research
– Public health
– Quality/safety measurement
– Payment
– Provider certification or accreditation
– Marketing
– Other commercial activities
Safran C, Bloomrosen M, Hammond E, Labkoff S, Markel-Fox S, Tang P, Detmer D. Toward a national framework for the secondary use of health data: An American Medical Informatics Association white paper. Journal of the American Medical Informatics Association, 2007; 14:1-9.
Types of Data Flows
Data Flows
• Uses by an agent/affiliate for secondary
purposes (e.g., financial analysis, human
resources planning)
• Mandatory disclosures (e.g., communicable
diseases, gunshot wounds)
• Permitted discretionary disclosures for
secondary purposes (e.g., public health and
research)
• Other disclosures for secondary purposes
(e.g., marketing)
Facilitating Disclosure
• Privacy and confidentiality concerns have made many health organizations very reluctant to share data and to take advantage of large-scale analytics in the cloud. Three factors contribute:
– Regulations that limit disclosure of personal information
– Legitimate concerns about potential data leaks
– Compelled disclosures
Methods to Facilitate Disclosure
• Two options now exist for disclosing
personal information for complex analytics to
avoid being the “creepy guy in the room”
– Anonymization
– Secure multi-party computation
ANONYMIZATION
De-identification
PARAT
Providing organizations with a scalable solution to automate
the anonymization of structured & unstructured data
• Measure risk of re-identification under different
attacks
• Transform data to ensure that the risk is below
a given threshold
• Configure re-identification risk threshold
settings directly from Privacy Analytics’ online
Risk Assessment application
• Determine enterprise policies for data
sharing to ensure that administrative controls
are in place to manage risk
• Automate data sharing agreements and certifications that confirm re-identification risks are “very small”
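The first two bullets (measure the risk, then transform the data until the risk falls below a threshold) can be illustrated with a minimal sketch. This is not PARAT's algorithm. It assumes the simplest risk measure, the maximum probability of re-identification, taken as 1 divided by the size of the smallest group of records that share the same quasi-identifier values. It also assumes a single hypothetical transformation, generalizing age into 10-year bands. The field names, records, and threshold are invented for illustration.

```python
from collections import Counter

def max_reid_risk(records, quasi_identifiers):
    """Maximum re-identification risk: 1 / size of the smallest equivalence class."""
    classes = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    return 1 / min(classes.values())

def generalize_age(record, width=10):
    """Return a copy of the record with age replaced by a band such as '20-29'."""
    low = record["age"] // width * width
    return {**record, "age": f"{low}-{low + width - 1}"}

# Hypothetical records; "age" and "region" play the role of quasi-identifiers.
records = [
    {"age": 23, "region": "K1H"},
    {"age": 27, "region": "K1H"},
    {"age": 24, "region": "K1H"},
    {"age": 35, "region": "K2P"},
    {"age": 38, "region": "K2P"},
    {"age": 31, "region": "K2P"},
]
QUASI_IDENTIFIERS = ("age", "region")
THRESHOLD = 0.5  # assumed risk threshold, chosen only for this example

raw_risk = max_reid_risk(records, QUASI_IDENTIFIERS)       # every record is unique
generalized = [generalize_age(rec) for rec in records]     # transform the data
new_risk = max_reid_risk(generalized, QUASI_IDENTIFIERS)   # re-measure the risk
print(raw_risk, new_risk, new_risk <= THRESHOLD)
```

In this toy data set every raw record is unique, so the raw risk is 1.0. After generalization each group holds three records, which brings the measured risk down to 1/3.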
PARAT Software
SECURE COMPUTATION
What is secure computation?
Secure Computation
• A set of techniques (protocols) developed to allow computations to be performed on encrypted data: do analytics without knowing or exposing the raw data
• Example computations:
– Public health surveillance: rates, categorical data analysis
– Rare adverse drug event detection using regression models (GLM and GEE) for distributed data
– Secure matching: record lookup without revealing the record details, matching databases without revealing matching keys
SECURE COMPUTATION
How does secure computation work?
Public Key Encryption
Randomized Public Key Encryption
If $r_1 \neq r_2$ then $c_1 \neq c_2$, but $\mathrm{Dec}_{sk}(c_1) = \mathrm{Dec}_{sk}(c_2) = m$
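The property on this slide (different randomness gives different ciphertexts that still decrypt to the same message) can be tried out with a small sketch. The slides do not name an encryption scheme. Paillier is assumed here because it is both randomized and additively homomorphic. The function names are hypothetical, and the two Mersenne primes are far too small for real use.

```python
import math
import secrets

def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two known primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)   # public key n, secret key (lam, mu)

def random_unit(n):
    """Random r in [1, n - 1] with gcd(r, n) = 1."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r

def encrypt(n, m, r=None):
    """c = (1 + n)^m * r^n mod n^2; a fresh r is drawn unless one is supplied."""
    r = random_unit(n) if r is None else r
    n2 = n * n
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) / n."""
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

n, sk = keygen()
m = 42
c1 = encrypt(n, m, r=17)  # r fixed here only to make the two runs visibly differ
c2 = encrypt(n, m, r=23)
print(c1 != c2, decrypt(n, sk, c1), decrypt(n, sk, c2))
```

In normal use `r` is left unset, so that every call to `encrypt` draws fresh randomness.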
Notation: [m] denotes an encrypted plaintext m
Additively Homomorphic Encryption
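The figures for these slides did not survive extraction. As a reminder of what "additively homomorphic" means, the property can be written with the bracket notation for ciphertexts used in the surveillance example. The form below assumes a Paillier-style scheme, in which ciphertexts are combined by multiplication; the slides do not name the scheme.

```latex
[m_1] \cdot [m_2] = [m_1 + m_2], \qquad [m]^{k} = [k \cdot m]
```

Multiplying two ciphertexts yields an encryption of the sum of the plaintexts, and raising a ciphertext to a known constant k yields an encryption of k times the plaintext. Neither operation requires the private key.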
SECURE SURVEILLANCE
ARO Surveillance from Long-Term Care Homes (LTCHs) in Ontario
Disclosure of colonization/infection rates is not currently legally required from LTCHs in Ontario
Objective: compute colonization rates without knowing the values for any single LTCH
DISEASE SURVEILLANCE
• Each facility submits its encrypted count: [count1], [count2], [count3], [count4]
• Data Aggregator: [count1] x [count2] x [count3] x [count4] = [count1 + count2 + count3 + count4]
• Key holder: decrypts the result and computes (count1 + count2 + count3 + count4) / 4
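The flow on this slide can be sketched end to end. The example again assumes a toy Paillier scheme (the slides do not name one). The function names and the four counts are hypothetical, and the key size is for illustration only. The data aggregator only ever handles ciphertexts, and the key holder only ever sees the total.

```python
import math
import secrets

def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two known primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))

def encrypt(n, m):
    """[m] = (1 + n)^m * r^n mod n^2 with fresh randomness r."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    n2 = n * n
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def aggregate(n, ciphertexts):
    """Data aggregator: multiplying ciphertexts adds the hidden counts."""
    total = 1  # a trivial encryption of 0
    for c in ciphertexts:
        total = total * c % (n * n)
    return total

n, sk = keygen()                                   # key holder publishes n, keeps sk
counts = [3, 0, 7, 2]                              # hypothetical per-facility counts
ciphertexts = [encrypt(n, c) for c in counts]      # each facility encrypts locally
encrypted_sum = aggregate(n, ciphertexts)          # [count1 + ... + count4]
mean = decrypt(n, sk, encrypted_sum) / len(counts) # key holder sees only the total
print(mean)
```

Separating the aggregator from the key holder is what keeps individual values hidden: the party that can decrypt never receives a single facility's ciphertext.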
DISEASE SURVEILLANCE
High Response Rate = 82%
DISEASE SURVEILLANCE (MRSA)
Prevalence by region and facility number of beds
Facility number of beds   Central West   North   East   Toronto   West   Central East   Bed group prevalence
0-60                      3.31           1.57    3.17   --        8.38   0.72           3.87
61-120                    2.73           1.07    2.04   --        7.88   1.8            3.34
121-180                   3.15           0.56    2.54   0.91      7.83   1.08           2.94
180+                      2.91           --      2.37   2.58      8.63   1.68           2.61
Regional prevalence       3.00           0.79    2.42   1.86      8.04   1.44
ANONYMOUS LINKING
Anonymous Linking
• Typical use cases:
– The best fields to link databases on are quite
sensitive: health insurance number, social
security/insurance number, medical record
number
– Organizations do not have the authority to
exchange data, but need to de-duplicate
databases or do lookups
• Anonymous linking allows the linking of
records in remote databases without sharing
any sensitive or personal information or
sharing any secrets
Generation and distribution of keys
Encryption of OHIP# using a public key
Encryption of local OHIP# using the same public key
Perform homomorphic equality test on the two encrypted values
Decrypt the results of the equality tests using the private key
Results of matches can be used to de-duplicate, link, or return a lookup outcome
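The slides do not say how the equality test is implemented. One common construction, shown here as a sketch under that assumption, uses an additively homomorphic scheme (toy Paillier below). The two ciphertexts are divided to obtain an encryption of the difference, and the result is raised to a random power so that the decrypted value reveals only whether the difference is zero. The function names and identifiers are hypothetical, and the key size is for illustration only.

```python
import math
import secrets

def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two known primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return n, (lam, pow(lam, -1, n))

def random_unit(n):
    """Random value in [1, n - 1] that is coprime to n."""
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            return r

def encrypt(n, m):
    n2 = n * n
    return pow(n + 1, m, n2) * pow(random_unit(n), n, n2) % n2

def decrypt(n, sk, c):
    lam, mu = sk
    return (pow(c, lam, n * n) - 1) // n * mu % n

def blinded_difference(n, c_a, c_b):
    """Combine [a] and [b] into [k * (a - b)] for a random secret k."""
    n2 = n * n
    diff = c_a * pow(c_b, -1, n2) % n2    # [a] / [b] = [a - b]
    return pow(diff, random_unit(n), n2)  # [a - b]^k = [k * (a - b)]

def is_match(n, sk, c):
    """Key holder: the blinded difference decrypts to 0 only when a == b."""
    return decrypt(n, sk, c) == 0

n, sk = keygen()                         # generation and distribution of keys
remote = encrypt(n, 1234567890)          # hypothetical identifier at one site
local_same = encrypt(n, 1234567890)      # same identifier at the other site
local_other = encrypt(n, 1234567891)     # a different identifier
print(is_match(n, sk, blinded_difference(n, remote, local_same)),
      is_match(n, sk, blinded_difference(n, remote, local_other)))
```

For a non-match the decrypted value is a random-looking multiple of the difference, so the key holder learns that the identifiers differ but not what they are.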
ONLINE & OFFLINE PURCHASES
Chlamydia Screening
Objective: compute screening rates and evaluate the impact of interventions to improve them
Pull data from EMRs (family doctors) on females aged 14-24 who are eligible for chlamydia screening, and match it with lab data to determine how many have been screened (match rates)
Matching on OHIP#, name, DoB
No release of personal information in the process
Critical Success Factors / Risks
• Embedding within a healthcare environment
• Large multi-disciplinary teams
• Supporting software after the initial prototype
• Academic evaluation criteria
• Publishing outside the traditional computer
science community
• Managing and protecting IP
Contact
kelemam@ehealthinformation.ca
@kelemam
www.ehealthinformation.ca