Open human genome data - presentation at the annual TKT/CLIDP doctoral programme symposium 2016 "Open up! – Open Data and Open Access" of the University of Turku.
1. Open human genome data
Open up! – Open Data and Open Access
Annual TKT/CLIDP Symposium
University of Turku, 13th May, 2016
Marja Pirttivaara, PhD, MBA (social and healthcare management)
3. Human genome: about 3 billion pairs of nucleotides.
Variants are Single Nucleotide Polymorphisms, SNPs.
Basics of genome and cost of sequencing
Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Genome Sequencing
Program (GSP). Available at: www.genome.gov/sequencingcosts. Accessed 13th May, 2016.
13th May, 2016, Marja Pirttivaara
4. Human genome – human evolution
“The human genome is contained in chromosomes that are present in almost
every cell in our bodies. It is composed of approximately 3.2 billion
nucleotides. When cells replicate to form germ cells that will contribute to
the next generation, mutations occur. As a result of these mutations,
about 50 to 200 new substitutions exist in every new individual that
is born. These substitutions accumulate in the genome over time to the
extent that roughly one nucleotide in a thousand differs between two human
genomes today, whereas roughly one nucleotide in a hundred differs
between a human and a chimpanzee genome. In addition, duplicated DNA
sequences differ both between individuals and between species.”
Svante Pääbo: The Contribution of Ancient Hominin Genomes from Siberia to
Our Understanding of Human Evolution, Herald of the Russian Academy of
Sciences, Vol 85 No. 5, 2015
http://www.eva.mpg.de/documents/RussianAcadSciences/Paeaebo_Contribution_HeraldRussAcadSci_2015_2226501.pdf
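As a rough worked example of the figures in the quote above (a sketch only: the genome size and the one-in-a-thousand and one-in-a-hundred rates come from the quote, while the function and variable names are made up for illustration):

```python
# Approximate figures taken from the Pääbo quote above.
GENOME_SIZE = 3_200_000_000  # nucleotides in the human genome


def expected_differences(genome_size, one_in):
    """Rough count of differing nucleotides when about 1 in `one_in` differs."""
    return genome_size // one_in


human_vs_human = expected_differences(GENOME_SIZE, 1_000)
human_vs_chimp = expected_differences(GENOME_SIZE, 100)

print(f"Two humans differ at roughly {human_vs_human:,} positions")
print(f"Human and chimpanzee differ at roughly {human_vs_chimp:,} positions")
```

The quoted rates therefore correspond to roughly 3.2 million differing positions between two people and roughly 32 million between a human and a chimpanzee.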
6. • KardioKompassi is FIMM’s first preventative health care pilot project utilizing
personal genetic risk information and returning it to the participants.
• Cardiovascular diseases: approx. 50 SNPs chosen.
• Partners: FIMM, Finnish Red Cross Blood Services and the Finnish Innovation
Fund Sitra.
• Aim: study the ways of providing people with health-risk information based on
genetic research data, the ways this information is used in preventive
healthcare and its usefulness with respect to individual health behaviour.
• In this project, the transfer of genetic information to an individual’s personal
online health account was also being tested for the first time in Finland.
• The application developed during the project is further developed and utilized
in new research projects.
Kardiokompassi – unique Finnish pilot
7. “Many, many more individuals will have to make their personal genomes publicly
available before we begin to get a real feeling of where we want to go.”
"I am homozygous for the “10” variant of the P450 drug metabolizing gene,
CYP2D6. As a result, I metabolize beta-blockers much more slowly than most
other Caucasians. Before I take this knowledge, my use of beta-blockers to
control my blood pressure caused me to constantly fall asleep at inappropriate
moments. Instead of a daily pill, I now take one every week….”
James Watson: Living with my personal genome
Ref: http://www.futuremedicine.com/doi/pdf/10.2217/pme.09.62
Picture: http://www.fraxa.org/fraxa/advisors/
8. Open data, open science, mydata,
big data, big science, open tools – and open minds
The Opportunity Project in the US
“..they've challenged the nation to ask not
what your country can code for you, but
ask what you can code for your country.” …
“What I saw at the launch of The
Opportunity Project yesterday suggested a
shift in an approach that has promise.
Instead of simply dumping a data set onto
Data.gov and challenging people to use it,
the White House worked with over 30 tech
companies and nonprofits to develop
prototypes of new tools or add features to
existing platforms…”
People
Tools
Data
http://www.techrepublic.com/article/president-obamas-new-open-data-initiative-could-help-cities-help-themselves
https://www.whitehouse.gov/the-press-office/2016/03/07/fact-sheet-white-house-launches-opportunity-project-utilizing-open-data
9. Genome data in open databases
• Primary nucleotide sequence databases:
• GenBank (NCBI / NIH, USA)
http://www.ncbi.nlm.nih.gov/genbank/
• EMBL (European Bioinformatics Institute)
European Nucleotide Archive
http://www.ebi.ac.uk/ena
• DDBJ, DNA Data Bank of Japan (National
Institute of Genetics)
http://www.ddbj.nig.ac.jp/
• Meta databases
• Genome databases
• Protein sequence databases
• Proteomics databases
• Protein structure databases
• Protein model databases
• RNA databases
• Carbohydrate structure databases
• Protein-protein and other molecular
interactions
• Signal transduction pathway databases
• Metabolic pathway and Protein Function
databases
• Microarray databases
• Exosomal databases
• Mathematical model databases
• PCR and quantitative PCR primer databases
• Phenotype databases
• Specialized databases
• Taxonomic databases
• Wiki-style databases
• Metabolomic Databases
• etc.
https://en.wikipedia.org/wiki/List_of_biological_databases
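Records in GenBank can also be retrieved programmatically. A minimal sketch, assuming NCBI's public E-utilities `efetch` endpoint and, as an example accession, NC_012920.1 (the human mitochondrial reference sequence). The helper name `genbank_fasta_url` is hypothetical, and the code only builds the request URL without contacting the server:

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EUTILS_EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_fasta_url(accession):
    """Build an efetch URL that asks for a nucleotide record in FASTA format."""
    params = {
        "db": "nuccore",      # nucleotide database, which includes GenBank records
        "id": accession,      # accession number of the record
        "rettype": "fasta",   # return the sequence as FASTA
        "retmode": "text",    # plain text rather than XML
    }
    return EUTILS_EFETCH + "?" + urlencode(params)


# Example accession (assumption for illustration): human mitochondrial reference.
url = genbank_fasta_url("NC_012920.1")
print(url)
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, should return the sequence as text. NCBI asks heavy users to follow its usage guidelines for request rates.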
10. Biobanks
• A biobank is “clinical samples + data +
informed consent” according to the
Finnish biobank regulation.
• Biobanks will form an evolving
global network of actors and
actions: samples, labs & testing,
data storage, consents, data
sharing, tools, cooperation,
competences…
• Ethics!
“Biobank is a collection of biological
samples and data gathered with
the donor’s consent for future
medical research and product
development for healthcare and
health promotion purposes.”
“Your consent could be crucial for
the development of new medicines
and treatments. Your sample could
change the world!”
http://www.biopankki.fi/
www.genome.gov
11. Open data deserves open tools: tools for genome data
Autosomal DNA tools
• http://isogg.org/wiki/Autosomal_DNA_tools
Y-DNA tools
• http://isogg.org/wiki/Y-DNA_tools
Mitochondrial DNA Tools
• http://isogg.org/wiki/MtDNA_tools
CSC Bioinformatics Tools
• https://research.csc.fi/bioscience-programs
BLAST finds regions of similarity between
biological sequences
• http://blast.ncbi.nlm.nih.gov/Blast.cgi
BEAST Bayesian evolutionary analysis by
sampling trees (Markov Chain Monte Carlo
simulation)
• http://beast2.org/
R language
• https://www.r-project.org/
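To make "regions of similarity" in the BLAST entry above concrete, here is a toy local-alignment score computed with Smith-Waterman-style dynamic programming. This is a sketch for illustration only: BLAST itself uses a much faster seed-and-extend heuristic rather than this exhaustive method, and the function name and scoring values are assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b (Smith-Waterman style)."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


# The shared stretch "ACGT" gives 4 matches at 2 points each.
print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8
```

A higher score means a longer or cleaner shared region. Sequences with nothing in common score 0.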
12. Philip E. Bourne: Open data in a Global Ecosystem, Nov. 2015
http://www.slideshare.net/pebourne/open-data-in-a-global-ecosystem
13. • Family Tree DNA / Gene by Gene
www.familytreedna.com
• 23andMe
www.23andme.com
• Ancestry
www.ancestry.com
• National Geographic
http://genographic.nationalgeographic.com
The Big Four in DTC (direct-to-consumer) genetic testing
14. Donation of mtDNA to the NIH NCBI GenBank
http://www.ncbi.nlm.nih.gov/
15. SNPedia and Promethease
• SNPedia is the Database
• Promethease is a Program to
Personalize
• Upload a file of genotypes with
dbSNP IDs and Promethease makes
a personalized report
• It uses SNPedia to find and compile
the scientific literature specific to
your DNA.
• Personal data is not stored, shared
or sold.
• http://www.snpedia.com/index.php/SNPedia
• https://www.snpedia.com/index.php/Promethease
• https://www.promethease.com/
• Antonio Regalado: How a Wiki Is Keeping Direct-to-Consumer Genetics Alive,
MIT Technology Review 2014
www.technologyreview.com/featuredstory/531461/how-a-wiki-is-keeping-direct-to-consumer-genetics-alive/
Map of SNPedia usage in Finland,
thx Mike Cariaso, SNPedia
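The workflow described on this slide (a genotype file keyed by dbSNP IDs, matched against a database of annotations) can be sketched as follows. This is not Promethease's actual code. It assumes a tab-separated raw-data layout of the kind DTC companies export (rsid, chromosome, position, genotype, with `#` comment lines), and the sample rows, the ID rs0000000 and the annotation text are made up for illustration.

```python
import io


def parse_genotypes(handle):
    """Read rsid -> genotype from a tab-separated raw-data file with '#' comments."""
    genotypes = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        rsid, _chromosome, _position, genotype = line.split("\t")
        genotypes[rsid] = genotype
    return genotypes


def build_report(genotypes, annotations):
    """Keep only the genotypes that have an entry in the annotation database."""
    report = []
    for rsid, genotype in genotypes.items():
        note = annotations.get((rsid, genotype))
        if note is not None:
            report.append((rsid, genotype, note))
    return report


# Made-up sample data; rs0000000 is a hypothetical ID with no annotation.
SAMPLE_RAW_DATA = (
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs1800497\t11\t113270828\tCT\n"
    "rs0000000\t1\t1000\tAA\n"
)

# Stand-in for the SNPedia lookup; the note text is a placeholder, not a finding.
ANNOTATIONS = {("rs1800497", "CT"): "placeholder note, see the SNPedia page for Rs1800497"}

genotypes = parse_genotypes(io.StringIO(SAMPLE_RAW_DATA))
for rsid, genotype, note in build_report(genotypes, ANNOTATIONS):
    print(rsid, genotype, note)
```

Because the report is built locally from the uploaded file and a lookup table, nothing in this sketch needs to store or share the personal data.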
16. Example from a personalized Promethease report
17. What do people want to know?
• What will I die from? When will I die?
• What diseases am I at higher risk for, and by how much?
• What diseases might my children inherit?
• Are some drugs better than others for me? Or worse?
• Who can explain all this to me?
• Will my health-care providers understand it?
Thx to Mike Cariaso and Greg Lennon, SNPedia
18. Stanford course Gene 210 – Genomics and
Personalized Medicine
The course includes analysis of one's own DTC genome data and writing for SNPedia
http://web.stanford.edu/class/gene210/web/html/welcome.html
http://www.snpedia.com/index.php/Rs1800497
19. National Genome Strategy: Improving health through
the use of genomic data
http://stm.fi/en/genomicdata
20. • Who owns the data?
• Who can use the data?
• Informed consent
• Trust
• Transparency
• Incidental findings
• Right to know and right not to know.
Privacy and ethics
www.genome.gov
21. New cooperation, new cultures, new skills,
new methodologies, new tools, new emphasis,
new modes of discovery...
New era
www.genome.gov