1. Yaniv Erlich @erlichya 10/17/18
The hitchhiker’s guide to genome hacking
Chief Science Officer, MyHeritage
Associate Professor of Computer Science, Columbia University
Adjunct Core Member, New York Genome Center
Dr. Yaniv Erlich
@erlichya
Slides are publicly available on:
2. We need to share data
Intro. Genealogical attacks Side-channel attacks Phenotypic attacks Summary
Zielinski et al., 2013. Guardian provided permission to share this photo.
4. Taxonomy of genome hacking techniques
Attribute disclosure
(whose is this phenotype, anyhow?)
Identity tracing
(whose is this genome, anyhow?)
Attacks
Genealogical triangulation
Side-channel
Phenotype prediction
5. The shape of identity tracing attacks
Anonymity means being lost in the crowd.
Identity tracing: narrow down the crowd from the sample
6. How much information do we need?
Identify every person in the US: log2(325 × 10^6) ≈ 28 bits
Surprisingly small!
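The arithmetic behind this figure can be checked with a short sketch. The function names are illustrative and not from the talk; the second function assumes that learning an attribute shared by a fraction p of the crowd reveals -log2(p) bits.

```python
import math


def bits_to_identify(population_size: int) -> float:
    """Bits needed to single out one person from a crowd of this size."""
    return math.log2(population_size)


def bits_from_attribute(fraction_sharing: float) -> float:
    """Bits revealed by an attribute shared by this fraction of the crowd."""
    return -math.log2(fraction_sharing)


us_population = 325 * 10**6  # population figure used on the slide
print(f"{bits_to_identify(us_population):.1f} bits")  # 28.3 bits

# Illustrative assumption: an attribute that splits the crowd in half
# (fraction 0.5) contributes exactly one bit.
print(bits_from_attribute(0.5))  # 1.0
```

Under this view, any set of independent attributes whose bits sum to about 28 is enough, in principle, to narrow the US population to one person.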
8. Correlation between Y-chr and surnames
Figure labels: www.ysearch.org; Y chromosomes (Y) shown with surnames (Smith, Smith, Smith, Erlich)
Genetic privacy
9. Surname inference
In total: 5 successful surname recoveries
Patrilineal line from source to target
Person tested by genetic genealogy service (source)
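The idea of recovering a surname from a Y-chromosome profile can be illustrated with a toy sketch. This is not the procedure used in the study: the records, repeat counts, mismatch threshold, and function names below are all invented for illustration (only the marker names are real Y-STR loci), and a real search would use many more markers and a proper genealogy database.

```python
from collections import Counter

# Hypothetical genealogy records: (surname, Y-STR haplotype).
# Marker names are real Y-STR loci; every repeat count is made up.
RECORDS = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Erlich", {"DYS19": 15, "DYS390": 22, "DYS391": 10, "DYS393": 12}),
]


def mismatches(query, record):
    """Count query markers whose repeat count differs (missing counts as a mismatch)."""
    return sum(record.get(marker) != count for marker, count in query.items())


def infer_surname(query, records, max_mismatches=1):
    """Return the most common surname among near-matching records, or None."""
    votes = Counter(
        surname
        for surname, haplotype in records
        if mismatches(query, haplotype) <= max_mismatches
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


anonymous_sample = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
print(infer_surname(anonymous_sample, RECORDS))  # Smith
```

Allowing a small number of mismatches is a design choice in this sketch: it lets distant patrilineal relatives, whose markers may have mutated, still vote for a surname.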
10. Long range familial searches
Come to my Plenary Abstract talk on Friday at 6:40pm
12. Side channel @ Personal Genome Project
Sweeney et al., 2013
Oops: default filename has the name of the individual!
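A data custodian could screen for this kind of leak before publishing a file. The sketch below is a minimal illustration, not the method of Sweeney et al.; the function name and the example filenames are hypothetical.

```python
import re


def leaked_name_tokens(filename: str, participant_name: str) -> list:
    """Return the parts of the participant's name that appear as tokens in the filename."""
    # Split on anything that is not a lowercase letter (digits, dots, underscores, ...).
    filename_tokens = set(re.split(r"[^a-z]+", filename.lower()))
    name_parts = [p for p in re.split(r"[^a-z]+", participant_name.lower()) if p]
    return [part for part in name_parts if part in filename_tokens]


# Hypothetical filenames, for illustration only.
print(leaked_name_tokens("genome_Jane_Doe_20120101.txt", "Jane Doe"))  # ['jane', 'doe']
print(leaked_name_tokens("hu1234.txt", "Jane Doe"))  # []
```

A token match is only a heuristic: it misses abbreviations and nicknames, and it can raise false alarms when a name is also a common word.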
13. PGP made it look like everything is fine (but it is not)
14. Can we use DNA phenotyping for identity tracing?
We can predict phenotypes from genetic data!
15. Not so fast…
Height at a single-cm resolution
Perfect knowledge
Theoretical limit (h²)
Current knowledge
(Lello et al., Genetics, 2018)
17. Not very promising…
Accuracy of prediction (R²)
[bits]
Current knowledge
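One way to relate prediction accuracy to bits is sketched below. It is an assumption, not necessarily the calculation behind the plot: if the trait and its genetic predictor are jointly Gaussian, their mutual information is -½·log2(1 − R²) bits. The R² values in the loop are illustrative, not taken from the slide.

```python
import math


def bits_from_prediction(r_squared: float) -> float:
    """Mutual information (bits) between a Gaussian trait and a predictor with this R^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must be in [0, 1)")
    return -0.5 * math.log2(1.0 - r_squared)


# Illustrative accuracy levels; none of these values is read off the slide.
for r2 in (0.1, 0.4, 0.75):
    print(f"R^2 = {r2:.2f} -> {bits_from_prediction(r2):.2f} bits")
```

Under this assumption even R² = 0.75 yields only one bit, which is small next to the roughly 28 bits needed to single out one person in the US.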
18. Challenges in face prediction
Working hypothesis: identifying individuals based on
phenotype inferences is typically a limited approach
Fig. S11 in Venter
19. Do we really understand privacy?
AshleyMadison.com JoinAllofUS.org