The Genomics Revolution: The
Good, The Bad, and The Ugly
(A Privacy Researcher's Perspective)
Emiliano De Cristofaro
University College London
https://emilianodc.com
In a nutshell
The Good
Revolution in medicine and healthcare
Genetic testing for the masses
The Bad
Collection of highly sensitive data
Very hard to anonymize / de-identify
The Ugly
Greater good vs privacy
Encryption might not be the answer
WGS Progress
History
1970s: DNA sequencing starts
1990: The “Human Genome Project” starts
2003: First human genome fully sequenced
2012: UK announces sequencing of 100K genomes
2015: USA announces sequencing of 1M genomes
$$$
$3B: Human Genome Project
$250K: Illumina (2008)
$5K: Complete Genomics (2009), Illumina (2011)
$1K: Illumina (2014)
How to read the
genome?
Genotyping
Testing for genetic
differences using a set
of markers
Sequencing
Determining the full
nucleotide order of an
organism’s genome
But… not all data are
created equal!
Privacy Researcher’s Perspective
Treasure trove of sensitive information
Ethnic heritage, predisposition to diseases
Genome = the ultimate identifier
Hard to anonymize / de-identify
Sensitivity is perpetual
Cannot be “revoked”
Leaking one’s genome ≈ leaking relatives’ genome
The Greater Good
vs
Privacy?
The rise of a new research community
Studying privacy issues
Exploring techniques to protect privacy
De-Anonymization
Melissa Gymrek et al. “Identifying Personal Genomes by
Surname Inference.” Science Vol. 339, No. 6117, 2013
Aggregation
Re-identification of aggregated data
Statistics from allele frequencies can be used to
identify genetic trial participants [1]
Presence of an individual in a group can be determined by
using allele frequencies and the individual's DNA profile [2]
[1] R. Wang et al. “Learning Your Identity and Disease from Research Papers:
Information Leaks in Genome Wide Association Study.” ACM CCS, 2009
[2] N. Homer et al. “Resolving Individuals Contributing Trace Amounts of DNA to Highly
Complex Mixtures Using High-Density SNP Genotyping Microarrays.” PLoS Genetics,
Vol. 4, Aug. 2008
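The membership inference mentioned above can be illustrated with a small sketch. It follows the general idea attributed to [2]: for each SNP, compare how far the individual's allele frequency is from a reference population versus from the published study pool. The function name, the toy numbers, and the plain sum (instead of a proper statistical test over very many SNPs) are assumptions made for illustration, not the authors' exact procedure.

```python
def membership_statistic(individual, pool_freqs, reference_freqs):
    """Sum over SNPs of |y - reference| - |y - pool|.

    individual: per-SNP allele frequency of the person (0.0, 0.5, or 1.0).
    pool_freqs: published allele frequencies of the study group.
    reference_freqs: allele frequencies of a reference population.
    """
    if not (len(individual) == len(pool_freqs) == len(reference_freqs)):
        raise ValueError("all inputs must cover the same SNPs")
    return sum(
        abs(y - ref) - abs(y - pool)
        for y, pool, ref in zip(individual, pool_freqs, reference_freqs)
    )


# Made-up toy numbers for four SNPs (not real data).
person = [1.0, 0.0, 0.5, 1.0]
study_pool = [0.8, 0.1, 0.5, 0.9]  # pool that is shifted toward the person
reference = [0.5, 0.4, 0.3, 0.6]   # unrelated reference population

score = membership_statistic(person, study_pool, reference)
print(score > 0)  # a positive score hints that the person may be in the pool
```

With only four SNPs the score means little; the signal becomes usable only when it is accumulated over a very large number of SNPs.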
Kin Privacy
Quantifying how much privacy relatives lose
when one’s genome is leaked
M. Humbert et al., “Addressing the Concerns of the Lacks Family:
Quantification of Kin Genomic Privacy.” Proceedings of ACM CCS, 2013
Also read: “Routes for
breaching genetic privacy”
Y. Erlich and A. Narayanan,
Nature Reviews Genetics
Vol. 15, No. 6, 2014
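Why a leaked genome also exposes relatives can be shown with a one-SNP sketch based on Mendelian inheritance. It is a deliberate simplification and not the inference method of Humbert et al.: it assumes a single SNP, one leaked parent, and an unobserved second parent drawn from the population (random mating). Function names and the example frequency are hypothetical.

```python
def child_genotype_distribution(parent_genotype, minor_allele_freq):
    """Return P(child has 0, 1, 2 minor alleles) at one SNP.

    parent_genotype: number of minor alleles (0, 1, or 2) of the leaked parent.
    minor_allele_freq: population frequency of the minor allele, used for the
        unobserved parent (assumes random mating; a simplification).
    """
    if parent_genotype not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1, or 2")
    t = parent_genotype / 2  # chance the known parent passes the minor allele
    f = minor_allele_freq    # chance the unknown parent passes it
    return [(1 - t) * (1 - f), t * (1 - f) + (1 - t) * f, t * f]


def population_prior(minor_allele_freq):
    """Genotype distribution when no relative is observed (Hardy-Weinberg)."""
    f = minor_allele_freq
    return [(1 - f) ** 2, 2 * f * (1 - f), f ** 2]


prior = population_prior(0.1)
posterior = child_genotype_distribution(2, 0.1)
print(prior)      # what an observer could guess without any leak
print(posterior)  # what the observer can guess after the parent's leak
```

In this example the leak rules out one of the three genotypes for the child entirely, which is the kind of privacy loss the slide refers to.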
Ethnographic Studies in WGS
Semi-structured interviews with 16 participants
Assessing perception of genetic tests, attitude toward WGS
programs, as well as perception of privacy/ethical issues
(Some) Preliminary results
1. Preferred method is through doctors not companies
(trust)
2. Labor/healthcare discrimination top concerns
3. Differences in correlation with income and education
E. De Cristofaro. “Users' Attitudes, Perception, and Concerns in the
Era of Whole Genome Sequencing.” (USEC 2014)
The rise of a new research community
Studying privacy issues
Exploring techniques to protect privacy
Differential Privacy
Privacy in Genome-Wide Association Studies (GWAS)
Computing number/location of SNPs associated with disease
Significance/correlation between a SNP and a disease
A. Johnson and V. Shmatikov. “Privacy-Preserving Data Exploration in
Genome-Wide Association Studies.” Proceedings of KDD, 2013
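As a rough illustration of differential privacy, the sketch below releases a count with the generic Laplace mechanism. This is not the algorithm of the cited paper; it only shows the basic idea of adding noise calibrated to sensitivity / epsilon. The count of 42, the default sensitivity of 1, and all names are assumptions; GWAS statistics generally need a more careful sensitivity analysis.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count under the Laplace mechanism with parameter epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)      # fixed seed only to make the demo repeatable
true_significant_snps = 42  # hypothetical number of SNPs passing a threshold
print(noisy_count(true_significant_snps, epsilon=1.0, rng=rng))
```

A smaller epsilon gives stronger privacy and a noisier answer, which is the accuracy trade-off analysts have to accept.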
Privacy-Preserving Genomic Tests
Individuals retain control of their sequenced
genome
Allow doctors/labs to run genetics tests, but:
1. Genome never disclosed, only test output is
2. Pharmas can keep test specifics confidential
… two main approaches …
1. Using Semi-Trusted Parties
[Diagram: Patient (P), Certified Institution (CI), Storage and Processing Unit (SPU), Medical Unit (MU). Labeled flows: (i) DNA sample; (i) clinical and environmental data; (ii) encrypted SNPs; (iii) disease risk computation.]
1. Using Semi-Trusted Parties
Ayday et al. (WPES’13)
Data is encrypted and stored at a “Storage and Processing Unit”
Disease susceptibility testing
Ayday et al. (DPM’13)
Encrypting raw genomic data (short reads)
Allowing medical unit to privately retrieve them
Danezis and De Cristofaro (WPES’14)
Regression for disease susceptibility
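The idea of testing disease susceptibility on encrypted SNPs can be sketched with an additively homomorphic scheme. The example below uses textbook Paillier with a toy key to compute a weighted sum over ciphertexts. It is an assumption-laden illustration, not the construction of the cited papers: the primes are far too small for real use, and the genotypes and weights are hypothetical. It needs Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random

# Toy key: these primes are far too small for real use.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is g = N + 1


def encrypt(message, rng):
    """Encrypt an integer 0 <= message < N under the toy public key."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N2) * pow(r, N, N2)) % N2


def decrypt(ciphertext):
    """Recover the plaintext with the private values LAMBDA and MU."""
    u = pow(ciphertext, LAMBDA, N2)
    return ((u - 1) // N * MU) % N


def encrypted_weighted_sum(encrypted_snps, weights):
    """Compute an encryption of sum(w_i * snp_i) without decrypting any SNP."""
    result = 1  # 1 is a valid encryption of 0
    for c, w in zip(encrypted_snps, weights):
        result = (result * pow(c, w, N2)) % N2
    return result


rng = random.Random(7)   # fixed seed for repeatability, not for security
snps = [0, 1, 2, 1]      # hypothetical genotypes (copies of a risk allele)
weights = [3, 5, 2, 4]   # hypothetical integer risk weights
ciphertexts = [encrypt(s, rng) for s in snps]
risk_ciphertext = encrypted_weighted_sum(ciphertexts, weights)
print(decrypt(risk_ciphertext))  # only the key holder can run this step
```

The party computing the weighted sum handles ciphertexts only, so in a design of this kind it never sees individual genotypes.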
2. Users keep sequenced genomes
[Diagram: an individual (input: genome) and a doctor or lab (input: test specifics) run Secure Function Evaluation; the output is the test result.]
Output reveals nothing beyond test result
Techniques:
• Private Set Intersection (PSI)
• Authorized PSI
• Private Pattern Matching
• Homomorphic Encryption
• Garbled Circuits
• […]
Tests:
• Paternity/Ancestry Testing
• Testing of SNPs/Markers
• Compatibility Testing
• […]
2. Users keep sequenced genomes
Baldi et al. (CCS’11)
Privacy-preserving version of a few genetic tests, based
on private set operations
Paternity test, Personalized Medicine, Compatibility Tests
(First work to consider fully sequenced genomes)
De Cristofaro et al. (WPES’12), extends the above
Framework and prototype deployment on Android
Adds Ancestry/Genealogy Testing
Private Set Intersection (PSI)
Server: $S = \{s_1, \ldots, s_w\}$
Client: $C = \{c_1, \ldots, c_v\}$
Output: $S \cap C$
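One classic way to realize PSI is a Diffie-Hellman-style construction: both parties hash their items into a group and blind them with secret exponents, and because exponentiation commutes, doubly blinded values match exactly when the underlying items match. The sketch below simulates both parties in one function. It is a toy under stated assumptions and not a specific protocol from the talk: the modulus, the hash-to-group step, the item names, and the use of `random` (chosen for repeatability) are all illustrative and give no real security.

```python
import hashlib
import random

# Toy modulus (a Mersenne prime); a real protocol needs a vetted group.
PRIME = 2**127 - 1


def hash_to_group(item):
    """Map a string to a nonzero element modulo PRIME."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (PRIME - 1) + 1


def blind(values, secret):
    """Raise each group element to a party's secret exponent."""
    return [pow(v, secret, PRIME) for v in values]


def toy_psi(client_items, server_items, rng):
    """Return the items of client_items that also appear in server_items."""
    client_items = list(client_items)
    a = rng.randrange(2, PRIME - 1)  # client secret
    b = rng.randrange(2, PRIME - 1)  # server secret

    # Client -> server: H(c)^a for each client item.
    client_blinded = blind([hash_to_group(c) for c in client_items], a)
    # Server -> client: (H(c)^a)^b in the same order, plus shuffled H(s)^b.
    client_double = blind(client_blinded, b)
    server_blinded = blind([hash_to_group(s) for s in server_items], b)
    rng.shuffle(server_blinded)

    # Client: compute (H(s)^b)^a and compare with its own doubly blinded items.
    server_double = set(blind(server_blinded, a))
    return {c for c, d in zip(client_items, client_double) if d in server_double}


common = toy_psi(["rs1", "rs2", "rs3"], ["rs2", "rs3", "rs4"], random.Random(1))
print(sorted(common))
```

In this flow the client sees the server's items only in blinded form, which is why it can recognize matches but not read the rest of the server's set.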
Private Set Intersection Cardinality-only (PSI-CA)
Server: $S = \{s_1, \ldots, s_w\}$
Client: $C = \{c_1, \ldots, c_v\}$
Output: $|S \cap C|$
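A cardinality-only variant can be sketched by changing one step of a Diffie-Hellman-style PSI: the server also shuffles the client's re-blinded values before returning them, so positions no longer indicate which client item matched and the client just counts overlaps. As before, this is a toy with both parties simulated in one function; the modulus, helper names, and use of `random` are assumptions for illustration, and it offers no real security.

```python
import hashlib
import random

PRIME = 2**127 - 1  # toy modulus, an assumption for illustration only


def _h(item):
    """Map a string to a nonzero element modulo PRIME."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (PRIME - 1) + 1


def toy_psi_ca(client_items, server_items, rng):
    """Return only the size of the intersection of the two item sets."""
    a = rng.randrange(2, PRIME - 1)  # client secret
    b = rng.randrange(2, PRIME - 1)  # server secret

    client_blinded = [pow(_h(c), a, PRIME) for c in set(client_items)]
    # Server re-blinds the client's values and shuffles them, so positions
    # no longer indicate which client item a value came from.
    client_double = [pow(v, b, PRIME) for v in client_blinded]
    rng.shuffle(client_double)
    server_blinded = [pow(_h(s), b, PRIME) for s in set(server_items)]
    rng.shuffle(server_blinded)

    # Client: finish blinding the server's values and count the overlaps.
    server_double = {pow(v, a, PRIME) for v in server_blinded}
    return sum(1 for d in client_double if d in server_double)


count = toy_psi_ca(["a", "b", "c", "d"], ["c", "d", "e"], random.Random(3))
print(count)
```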
Private RFLP-based Paternity Test
Private Set Intersection Cardinality
Test Result: (#fragments with same length)
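The test result on this slide, the number of fragments with the same length, can be sketched in the clear as follows. In the private version this count would come from PSI-CA, so neither side would see the other's fragments. The (marker, length) encoding, the marker names, the lengths, and the decision threshold are hypothetical choices for illustration.

```python
def matching_fragments(fragments_a, fragments_b):
    """Count (marker, length) fragments present in both profiles."""
    return len(set(fragments_a) & set(fragments_b))


def paternity_test(fragments_a, fragments_b, threshold):
    """Report a match when enough fragments have the same length."""
    return matching_fragments(fragments_a, fragments_b) >= threshold


# Hypothetical profiles: (marker name, fragment length in base pairs).
child = [("m1", 1200), ("m2", 850), ("m3", 430), ("m4", 2210)]
alleged_father = [("m1", 1200), ("m2", 850), ("m3", 510), ("m4", 2210)]

print(matching_fragments(child, alleged_father))
print(paternity_test(child, alleged_father, threshold=3))
```

Pairing each length with its marker keeps two fragments of equal length from different markers from being counted as a match.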
Open Problems
Where do we store genomes?
Encryption can’t guarantee security past 30-50 yrs
Reliability and availability issues?
Cryptography
Efficiency overhead
Data representation assumptions
How much understanding required from users?
Why do we even care about genome privacy?
We all leave biological cells behind…
Hair, saliva, etc., can be collected and sequenced?
Compare this “attack” to re-identifying millions
of DNA donors or hacking into 23andMe…
The former: expensive, prone to mistakes, only works
against a handful of targeted victims
The latter: very “scalable”
Epilogue
Whole Genome Sequencing
A revolution in healthcare
Raises worrisome privacy/ethical concerns
The Genomic Privacy research community
Understanding the privacy issues
Privacy-preserving testing on whole genomes
Possible using efficient crypto protocols & cross-discipline
collaboration
A number of open research issues…
For more info:
http://genomeprivacy.org
Also:
E. Ayday, E. De Cristofaro, J.P. Hubaux, G. Tsudik.
“Whole Genome Sequencing: Revolutionary
Medicine or Privacy Nightmare?”
IEEE Computer Magazine
Thank you!
Special thanks to
E. Ayday, P. Baldi, R. Baronio, G. Danezis, S. Faber,
P. Gasti, J-P. Hubaux, B. Malin, G. Tsudik.
