SlideShare a Scribd company logo
1 of 34
Download to read offline
The Genomics Revolution: The
Good, The Bad, and The Ugly
(A Privacy Researcher's Perspective)
Emiliano De Cristofaro
University College London
https://emilianodc.com
In a nutshell
The Good
Revolution in medicine and healthcare
Genetic testing for the masses
The Bad
Collection of highly sensitive data
Very hard to anonymize / de-identify
The Ugly
Greater good vs privacy
Encryption might not be the answer
2
3
WGS Progress
History
1970s: DNA sequencing starts
1990: The “Human Genome Project” starts
2003: First human genome fully sequenced
2012: UK announces sequencing of 100K genomes
2015: USA announces sequencing of 1M genomes
$$$
$3B: Human Genome Project
$250K: Illumina (2008)
$5K: Complete Genomics (2009), Illumina (2011)
$1K: Illumina (2014) 4
How to read the
genome?
Genotyping
Testing for genetic
differences using a set
of markers
Sequencing
Determining the full
nucleotide order of an
organism’s genome
5
6
7
8
9
But… not all data are
created equal!
11
Privacy Researcher’s Perspective
Treasure trove of sensitive information
Ethnic heritage, predisposition to diseases
Genome = the ultimate identifier
Hard to anonymize / de-identify
Sensitivity is perpetual
Cannot be “revoked”
Leaking one’s genome ≈ leaking relatives’ genome
12
The Greater Good
vs
Privacy?
13
The rise of a new research community
Studying privacy issues
Exploring techniques to protect privacy
14
De-Anonymization
Melissa Gymrek et al. “Identifying Personal Genomes by
Surname Inference.” Science Vol. 339, No. 6117, 2013 15
Aggregation
Re-identification of aggregated data
Statistics from allele frequencies can be used to
identify genetic trial participants [1]
Presence of an individual in a group can be determined by
using allele frequencies and his DNA profile [2]
[1] R. Wang et al. “Learning Your Identity and Disease from Research Papers:
Information Leaks in Genome Wide Association Study.” ACM CCS, 2009
[2] N. Homer et al. Resolving individuals contributing trace amounts of DNA to highly
complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics,
4, Aug. 2008 16
Kin Privacy
Quantifying how much privacy do relatives lose
when one’s genome is leaked?
M. Humbert et al., “Addressing the Concerns of the Lacks Family:
Quantification of Kin Genomic Privacy.” Proceedings of ACM CCS, 2013
Also read: “Routes for
breaching genetic privacy”
Y. Erlich and A. Narayanan,
Nature Review Genetics
Vol. 15, No. 6, 2014
18
Ethnographic Studies in WGS
Semi-structured interviews with 16 participants
Assessing perception of genetic tests, attitude toward WGS
programs, as well as perception of privacy/ethical issues
(Some) Preliminary results
1. Preferred method is through doctors not companies
(trust)
2. Labor/healthcare discrimination top concerns
3. Differences in correlation with income and education
E. De Cristofaro. “Users' Attitudes, Perception, and Concerns in the
Era of Whole Genome Sequencing.” (USEC 2014)
19
The rise of a new research community
Studying privacy issues
Exploring techniques to protect privacy
20
Differential Privacy
Computing number/location of SNPs associated to disease
Significance/correlation between a SNP and a disease
A. Johnson and V. Shmatikov. “Privacy-Preserving Data Exploration in
Genome-Wide Association Studies.” Proceedings of KDD, 2013
21
Privacy in Genome Wide Association Studies (GWAS)
Privacy-Preserving Genomic Tests
Individuals retain control of their sequenced
genome
Allow doctors/labs to run genetics tests, but:
1. Genome never disclosed, only test output is
2. Pharmas can keep test specifics confidential
… two main approaches …
22
(i)DNA
sample
(i) Clinical and
Environmental
data
(ii) Encrypted SNPs
(iii)Disease
Risk
Computation
CERTIFIED
INSTITUTION (CI)
MEDICAL
UNIT (MU)
STORAGE AND
PROCESSING UNIT (SPU)
PATIENT
(P)
1. Using Semi-Trusted Parties
23
1. Using Semi-Trusted Parties
Ayday et al. (WPES’13)
Data is encrypted and stored at a “Storage Process Unit”
Disease susceptibility testing
Ayday et al. (DPM’13)
Encrypting raw genomic data (short reads)
Allowing medical unit to privately retrieve them
Danezis and De Cristofaro (WPES’14)
Regression for disease susceptibility
24
doctor
or lab
genome
individual
test specifics
Secure
Function
Evaluation
test result test result
• Private Set Intersection (PSI)
• Authorized PSI
• Private Pattern Matching
• Homomorphic Encryption
• Garbled Circtuis
• […]
Output reveals nothing beyond
test result
• Paternity/Ancestry Testing
• Testing of SNPs/Markers
• Compatibility Testing
• […]
2. Users keep sequenced genomes
25
2. Users keep sequenced genomes
Baldi et al. (CCS’11)
Privacy-preserving version of a few genetic tests, based
on private set operations
Paternity test, Personalized Medicine, Compatibility Tests
(First work to consider fully sequenced genomes)
De Cristofaro et al. (WPES’12), extends the above
Framework and prototype deployment on Android
Adds Ancestry/Genealogy Testing
26
Private Set Intersection (PSI)
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
27
Private Set Intersection
Cardinality-only (PSI-CA)
Server Client
S ={s1, ,sw} C ={c1, ,cv}
PSI-CA
SÇC
28
Private Set
Intersection
Cardinality
Test Result:
(#fragments with same length)
Private RFLP-based Paternity Test
29
Open Problems
Where do we store genomes?
Encryption can’t guarantee security past 30-50 yrs
Reliability and availability issues?
Cryptography
Efficiency overhead
Data representation assumptions
How much understanding required from users?
30
We all leave biological cells behind…
Hair, saliva, etc., can be collected and sequenced?
Compare this “attack” to re-identifying millions
of DNA donors or hacking into 23andme…
The former: expensive, prone to mistakes, only works
against a handful of targeted victims
The latter: very “scalable”
Why do we even care
about genome privacy?
31
Epilogue
Whole Genome Sequencing
A revolution in healthcare
Raises worrisome privacy/ethical concerns
The Genomic Privacy research community
Understanding the privacy issues
Privacy-preserving testing on whole genomes
Possible using efficient crypto protocols & cross-discipline
collaboration
A number of open research issues…
32
For more info:
http://genomeprivacy.org
Also:
E. Ayday, E. De Cristofaro, J.P. Hubaux, G. Tsudik.
“Whole Genome Sequencing: Revolutionary
Medicine or Privacy Nightmare?”
IEEE Computer Magazine
Thank you!
Special thanks to
E. Ayday, P. Baldi, R. Baronio, G. Danezis, S. Faber,
P. Gasti, J-P. Hubaux, B. Malin, G. Tsudik.

More Related Content

What's hot

Investigation of phylogenic relationships of shrew populations using genetic...
Investigation of phylogenic relationships  of shrew populations using genetic...Investigation of phylogenic relationships  of shrew populations using genetic...
Investigation of phylogenic relationships of shrew populations using genetic...Juan Barrera
 
Hospital Microbiome Project Proposal
Hospital Microbiome Project ProposalHospital Microbiome Project Proposal
Hospital Microbiome Project Proposaldansmith01
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global communityExternalEvents
 
Haendel clingenetics.3.14.14
Haendel clingenetics.3.14.14Haendel clingenetics.3.14.14
Haendel clingenetics.3.14.14mhaendel
 
Shining a light on virus infection with high-resolution microscopy
Shining a light on virus infection with high-resolution microscopyShining a light on virus infection with high-resolution microscopy
Shining a light on virus infection with high-resolution microscopyRoland Remenyi, PhD
 
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...Application of Whole Genome Sequencing in the infectious disease’ in vitro di...
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...ExternalEvents
 
CORONAVIRUS AKA SARS PATENT
CORONAVIRUS AKA SARS PATENT CORONAVIRUS AKA SARS PATENT
CORONAVIRUS AKA SARS PATENT ICJ-ICC
 
What's In a Genotype?: An Ontological Characterization for the Integration of...
What's In a Genotype?: An Ontological Characterization for the Integration of...What's In a Genotype?: An Ontological Characterization for the Integration of...
What's In a Genotype?: An Ontological Characterization for the Integration of...mhb120
 
Seminario biología molecular
Seminario biología molecular Seminario biología molecular
Seminario biología molecular IsabelaJARAMILLO1
 
485 lec2 history and review (i)
485 lec2 history and review (i)485 lec2 history and review (i)
485 lec2 history and review (i)hhalhaddad
 
Us7279327 Methods for producing recombinant coronavirus
Us7279327 Methods for producing recombinant coronavirusUs7279327 Methods for producing recombinant coronavirus
Us7279327 Methods for producing recombinant coronavirusCultura Arte Sociedad (Cuarso)
 
Basic knowledge of_viral_metagenome_vanshika-varshney
Basic knowledge of_viral_metagenome_vanshika-varshneyBasic knowledge of_viral_metagenome_vanshika-varshney
Basic knowledge of_viral_metagenome_vanshika-varshneyVanshikaVarshney5
 
Microbial Metagenomics and Human Health
Microbial Metagenomics and Human HealthMicrobial Metagenomics and Human Health
Microbial Metagenomics and Human HealthLarry Smarr
 
Human genome project
Human genome projectHuman genome project
Human genome projectAmjad Afridi
 
Line of Attack, Science Feb 27
Line of Attack, Science Feb 27Line of Attack, Science Feb 27
Line of Attack, Science Feb 27Jill Neimark
 
Evo genetic evidence_lt10
Evo genetic evidence_lt10Evo genetic evidence_lt10
Evo genetic evidence_lt10siskiyoukid
 
Supporting Genomics in the Practice of Medicine by Heidi Rehm
Supporting Genomics in the Practice of Medicine by Heidi RehmSupporting Genomics in the Practice of Medicine by Heidi Rehm
Supporting Genomics in the Practice of Medicine by Heidi RehmKnome_Inc
 
Quantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryQuantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryLarry Smarr
 

What's hot (19)

Investigation of phylogenic relationships of shrew populations using genetic...
Investigation of phylogenic relationships  of shrew populations using genetic...Investigation of phylogenic relationships  of shrew populations using genetic...
Investigation of phylogenic relationships of shrew populations using genetic...
 
Hospital Microbiome Project Proposal
Hospital Microbiome Project ProposalHospital Microbiome Project Proposal
Hospital Microbiome Project Proposal
 
Building bioinformatics resources for the global community
Building bioinformatics resources for the global communityBuilding bioinformatics resources for the global community
Building bioinformatics resources for the global community
 
Haendel clingenetics.3.14.14
Haendel clingenetics.3.14.14Haendel clingenetics.3.14.14
Haendel clingenetics.3.14.14
 
Shining a light on virus infection with high-resolution microscopy
Shining a light on virus infection with high-resolution microscopyShining a light on virus infection with high-resolution microscopy
Shining a light on virus infection with high-resolution microscopy
 
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...Application of Whole Genome Sequencing in the infectious disease’ in vitro di...
Application of Whole Genome Sequencing in the infectious disease’ in vitro di...
 
CORONAVIRUS AKA SARS PATENT
CORONAVIRUS AKA SARS PATENT CORONAVIRUS AKA SARS PATENT
CORONAVIRUS AKA SARS PATENT
 
What's In a Genotype?: An Ontological Characterization for the Integration of...
What's In a Genotype?: An Ontological Characterization for the Integration of...What's In a Genotype?: An Ontological Characterization for the Integration of...
What's In a Genotype?: An Ontological Characterization for the Integration of...
 
Seminario biología molecular
Seminario biología molecular Seminario biología molecular
Seminario biología molecular
 
485 lec2 history and review (i)
485 lec2 history and review (i)485 lec2 history and review (i)
485 lec2 history and review (i)
 
Us7279327 Methods for producing recombinant coronavirus
Us7279327 Methods for producing recombinant coronavirusUs7279327 Methods for producing recombinant coronavirus
Us7279327 Methods for producing recombinant coronavirus
 
Basic knowledge of_viral_metagenome_vanshika-varshney
Basic knowledge of_viral_metagenome_vanshika-varshneyBasic knowledge of_viral_metagenome_vanshika-varshney
Basic knowledge of_viral_metagenome_vanshika-varshney
 
Microbial Metagenomics and Human Health
Microbial Metagenomics and Human HealthMicrobial Metagenomics and Human Health
Microbial Metagenomics and Human Health
 
Human genome project
Human genome projectHuman genome project
Human genome project
 
Line of Attack, Science Feb 27
Line of Attack, Science Feb 27Line of Attack, Science Feb 27
Line of Attack, Science Feb 27
 
Evo genetic evidence_lt10
Evo genetic evidence_lt10Evo genetic evidence_lt10
Evo genetic evidence_lt10
 
Eleanor Gallo_Hendrikx CV
Eleanor Gallo_Hendrikx CVEleanor Gallo_Hendrikx CV
Eleanor Gallo_Hendrikx CV
 
Supporting Genomics in the Practice of Medicine by Heidi Rehm
Supporting Genomics in the Practice of Medicine by Heidi RehmSupporting Genomics in the Practice of Medicine by Heidi Rehm
Supporting Genomics in the Practice of Medicine by Heidi Rehm
 
Quantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic ObservatoryQuantified Self On Being A Personal Genomic Observatory
Quantified Self On Being A Personal Genomic Observatory
 

Similar to The Genomics Revolution: The Good, The Bad, and The Ugly

The Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The UglyThe Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The UglyEmiliano De Cristofaro
 
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...Emiliano De Cristofaro
 
Crowdsourcing the Analysis of Genomes
Crowdsourcing the Analysis of GenomesCrowdsourcing the Analysis of Genomes
Crowdsourcing the Analysis of GenomesBastian Greshake
 
2010StanfordE25 Michele dragoescu e25 project
2010StanfordE25 Michele dragoescu e25 project2010StanfordE25 Michele dragoescu e25 project
2010StanfordE25 Michele dragoescu e25 projectmdragoescu
 
Hyoji genetic testing essay
Hyoji genetic testing essayHyoji genetic testing essay
Hyoji genetic testing essayHyoji
 
Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?Melanie Swan
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management inscit2006
 
Dna profiling presentation x2
Dna profiling presentation x2Dna profiling presentation x2
Dna profiling presentation x2Eli Rosenthal
 
Dna profiling presentation x2
Dna profiling presentation x2Dna profiling presentation x2
Dna profiling presentation x2teamchaotex
 
Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...koda004
 
TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)jmoore89
 
Emerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsmikaelhuss
 
Methods to enhance the validity of precision guidelines emerging from big data
Methods to enhance the validity of precision guidelines emerging from big dataMethods to enhance the validity of precision guidelines emerging from big data
Methods to enhance the validity of precision guidelines emerging from big dataChirag Patel
 
Grand round whsiao_may2015
Grand round whsiao_may2015Grand round whsiao_may2015
Grand round whsiao_may2015IRIDA_community
 
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality?  - William HsiaoHow Can We Make Genomic Epidemiology a Widespread Reality?  - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality? - William HsiaoWilliam Hsiao
 
Amia tb-review-13
Amia tb-review-13Amia tb-review-13
Amia tb-review-13Russ Altman
 
Gen epio immem_griffiths
Gen epio immem_griffithsGen epio immem_griffiths
Gen epio immem_griffithsIRIDA_community
 
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...Emma Griffiths
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challengesinside-BigData.com
 
Bioinformatics Strategies for Exposome 100416
Bioinformatics Strategies for Exposome 100416Bioinformatics Strategies for Exposome 100416
Bioinformatics Strategies for Exposome 100416Chirag Patel
 

Similar to The Genomics Revolution: The Good, The Bad, and The Ugly (20)

The Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The UglyThe Genomics Revolution: The Good, The Bad, The Ugly
The Genomics Revolution: The Good, The Bad, The Ugly
 
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
The Genomics Revolution: The Good, The Bad, and The Ugly (Confessions of a Pr...
 
Crowdsourcing the Analysis of Genomes
Crowdsourcing the Analysis of GenomesCrowdsourcing the Analysis of Genomes
Crowdsourcing the Analysis of Genomes
 
2010StanfordE25 Michele dragoescu e25 project
2010StanfordE25 Michele dragoescu e25 project2010StanfordE25 Michele dragoescu e25 project
2010StanfordE25 Michele dragoescu e25 project
 
Hyoji genetic testing essay
Hyoji genetic testing essayHyoji genetic testing essay
Hyoji genetic testing essay
 
Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?Personal Genomes: what can I do with my data?
Personal Genomes: what can I do with my data?
 
Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management Evolution of Knowledge Discovery and Management
Evolution of Knowledge Discovery and Management
 
Dna profiling presentation x2
Dna profiling presentation x2Dna profiling presentation x2
Dna profiling presentation x2
 
Dna profiling presentation x2
Dna profiling presentation x2Dna profiling presentation x2
Dna profiling presentation x2
 
Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...
 
TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)TLSC Biotech 101 Noc 2010 (Moore)
TLSC Biotech 101 Noc 2010 (Moore)
 
Emerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomicsEmerging challenges in data-intensive genomics
Emerging challenges in data-intensive genomics
 
Methods to enhance the validity of precision guidelines emerging from big data
Methods to enhance the validity of precision guidelines emerging from big dataMethods to enhance the validity of precision guidelines emerging from big data
Methods to enhance the validity of precision guidelines emerging from big data
 
Grand round whsiao_may2015
Grand round whsiao_may2015Grand round whsiao_may2015
Grand round whsiao_may2015
 
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality?  - William HsiaoHow Can We Make Genomic Epidemiology a Widespread Reality?  - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
 
Amia tb-review-13
Amia tb-review-13Amia tb-review-13
Amia tb-review-13
 
Gen epio immem_griffiths
Gen epio immem_griffithsGen epio immem_griffiths
Gen epio immem_griffiths
 
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...
IRIDA's Genomic epidemiology application ontology (GenEpiO): Genomic, clinica...
 
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and ChallengesSingle-Cell Sequencing for Drug Discovery: Applications and Challenges
Single-Cell Sequencing for Drug Discovery: Applications and Challenges
 
Bioinformatics Strategies for Exposome 100416
Bioinformatics Strategies for Exposome 100416Bioinformatics Strategies for Exposome 100416
Bioinformatics Strategies for Exposome 100416
 

Recently uploaded

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Nikki Chapple
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentMahmoud Rabie
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentPim van der Noll
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxfnnc6jmgwh
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...Karmanjay Verma
 
Why Agile? - A handbook behind Agile Evolution
Why Agile? - A handbook behind Agile EvolutionWhy Agile? - A handbook behind Agile Evolution
Why Agile? - A handbook behind Agile EvolutionDEEPRAJ PATHAK
 

Recently uploaded (20)

How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
Microsoft 365 Copilot: How to boost your productivity with AI – Part two: Dat...
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Digital Tools & AI in Career Development
Digital Tools & AI in Career DevelopmentDigital Tools & AI in Career Development
Digital Tools & AI in Career Development
 
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native developmentEmixa Mendix Meetup 11 April 2024 about Mendix Native development
Emixa Mendix Meetup 11 April 2024 about Mendix Native development
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...React JS; all concepts. Contains React Features, JSX, functional & Class comp...
React JS; all concepts. Contains React Features, JSX, functional & Class comp...
 
Why Agile? - A handbook behind Agile Evolution
Why Agile? - A handbook behind Agile EvolutionWhy Agile? - A handbook behind Agile Evolution
Why Agile? - A handbook behind Agile Evolution
 

The Genomics Revolution: The Good, The Bad, and The Ugly

  • 1. The Genomics Revolution: The Good, The Bad, and The Ugly (A Privacy Researcher's Perspective) Emiliano De Cristofaro University College London https://emilianodc.com
  • 2. In a nutshell The Good Revolution in medicine and healthcare Genetic testing for the masses The Bad Collection of highly sensitive data Very hard to anonymize / de-identify The Ugly Greater good vs privacy Encryption might not be the answer 2
  • 3. 3
  • 4. WGS Progress History 1970s: DNA sequencing starts 1990: The “Human Genome Project” starts 2003: First human genome fully sequenced 2012: UK announces sequencing of 100K genomes 2015: USA announces sequencing of 1M genomes $$$ $3B: Human Genome Project $250K: Illumina (2008) $5K: Complete Genomics (2009), Illumina (2011) $1K: Illumina (2014) 4
  • 5. How to read the genome? Genotyping Testing for genetic differences using a set of markers Sequencing Determining the full nucleotide order of an organism’s genome 5
  • 6. 6
  • 7. 7
  • 8. 8
  • 9. 9
  • 10.
  • 11. But… not all data are created equal! 11
  • 12. Privacy Researcher’s Perspective Treasure trove of sensitive information Ethnic heritage, predisposition to diseases Genome = the ultimate identifier Hard to anonymize / de-identify Sensitivity is perpetual Cannot be “revoked” Leaking one’s genome ≈ leaking relatives’ genome 12
  • 14. The rise of a new research community Studying privacy issues Exploring techniques to protect privacy 14
  • 15. De-Anonymization Melissa Gymrek et al. “Identifying Personal Genomes by Surname Inference.” Science Vol. 339, No. 6117, 2013 15
  • 16. Aggregation Re-identification of aggregated data Statistics from allele frequencies can be used to identify genetic trial participants [1] Presence of an individual in a group can be determined by using allele frequencies and his DNA profile [2] [1] R. Wang et al. “Learning Your Identity and Disease from Research Papers: Information Leaks in Genome Wide Association Study.” ACM CCS, 2009 [2] N. Homer et al. Resolving individuals contributing trace amounts of DNA to highly complex mixtures using high-density SNP genotyping microarrays. PLoS Genetics, 4, Aug. 2008 16
  • 17. Kin Privacy Quantifying how much privacy do relatives lose when one’s genome is leaked? M. Humbert et al., “Addressing the Concerns of the Lacks Family: Quantification of Kin Genomic Privacy.” Proceedings of ACM CCS, 2013 Also read: “Routes for breaching genetic privacy” Y. Erlich and A. Narayanan, Nature Review Genetics Vol. 15, No. 6, 2014
  • 18. 18
  • 19. Ethnographic Studies in WGS Semi-structured interviews with 16 participants Assessing perception of genetic tests, attitude toward WGS programs, as well as perception of privacy/ethical issues (Some) Preliminary results 1. Preferred method is through doctors not companies (trust) 2. Labor/healthcare discrimination top concerns 3. Differences in correlation with income and education E. De Cristofaro. “Users' Attitudes, Perception, and Concerns in the Era of Whole Genome Sequencing.” (USEC 2014) 19
  • 20. The rise of a new research community Studying privacy issues Exploring techniques to protect privacy 20
  • 21. Differential Privacy Computing number/location of SNPs associated to disease Significance/correlation between a SNP and a disease A. Johnson and V. Shmatikov. “Privacy-Preserving Data Exploration in Genome-Wide Association Studies.” Proceedings of KDD, 2013 21 Privacy in Genome Wide Association Studies (GWAS)
  • 22. Privacy-Preserving Genomic Tests Individuals retain control of their sequenced genome Allow doctors/labs to run genetics tests, but: 1. Genome never disclosed, only test output is 2. Pharmas can keep test specifics confidential … two main approaches … 22
  • 23. (i)DNA sample (i) Clinical and Environmental data (ii) Encrypted SNPs (iii)Disease Risk Computation CERTIFIED INSTITUTION (CI) MEDICAL UNIT (MU) STORAGE AND PROCESSING UNIT (SPU) PATIENT (P) 1. Using Semi-Trusted Parties 23
  • 24. 1. Using Semi-Trusted Parties Ayday et al. (WPES’13) Data is encrypted and stored at a “Storage Process Unit” Disease susceptibility testing Ayday et al. (DPM’13) Encrypting raw genomic data (short reads) Allowing medical unit to privately retrieve them Danezis and De Cristofaro (WPES’14) Regression for disease susceptibility 24
  • 25. doctor or lab genome individual test specifics Secure Function Evaluation test result test result • Private Set Intersection (PSI) • Authorized PSI • Private Pattern Matching • Homomorphic Encryption • Garbled Circtuis • […] Output reveals nothing beyond test result • Paternity/Ancestry Testing • Testing of SNPs/Markers • Compatibility Testing • […] 2. Users keep sequenced genomes 25
  • 26. 2. Users keep sequenced genomes Baldi et al. (CCS’11) Privacy-preserving version of a few genetic tests, based on private set operations Paternity test, Personalized Medicine, Compatibility Tests (First work to consider fully sequenced genomes) De Cristofaro et al. (WPES’12), extends the above Framework and prototype deployment on Android Adds Ancestry/Genealogy Testing 26
  • 27. Private Set Intersection (PSI) Server Client S ={s1, ,sw} C ={c1, ,cv} Private Set Intersection SÇC 27
  • 28. Private Set Intersection Cardinality-only (PSI-CA) Server Client S ={s1, ,sw} C ={c1, ,cv} PSI-CA SÇC 28
  • 29. Private Set Intersection Cardinality Test Result: (#fragments with same length) Private RFLP-based Paternity Test 29
  • 30. Open Problems Where do we store genomes? Encryption can’t guarantee security past 30-50 yrs Reliability and availability issues? Cryptography Efficiency overhead Data representation assumptions How much understanding required from users? 30
  • 31. We all leave biological cells behind… Hair, saliva, etc., can be collected and sequenced? Compare this “attack” to re-identifying millions of DNA donors or hacking into 23andme… The former: expensive, prone to mistakes, only works against a handful of targeted victims The latter: very “scalable” Why do we even care about genome privacy? 31
  • 32. Epilogue Whole Genome Sequencing A revolution in healthcare Raises worrisome privacy/ethical concerns The Genomic Privacy research community Understanding the privacy issues Privacy-preserving testing on whole genomes Possible using efficient crypto protocols & cross-discipline collaboration A number of open research issues… 32
  • 33. For more info: http://genomeprivacy.org Also: E. Ayday, E. De Cristofaro, J.P. Hubaux, G. Tsudik. “Whole Genome Sequencing: Revolutionary Medicine or Privacy Nightmare?” IEEE Computer Magazine
  • 34. Thank you! Special thanks to E. Ayday, P. Baldi, R. Baronio, G. Danezis, S. Faber, P. Gasti, J-P. Hubaux, B. Malin, G. Tsudik.