The document provides an overview of biobanking from the perspective of a user. It discusses three examples of biobanking: 1) Using postmortem brain samples from the NIH NeuroBioBank to validate findings related to Sturge-Weber syndrome. 2) Establishing a biobank for Sturge-Weber syndrome. 3) Discovering mosaic mutations in autism samples by analyzing genomic data and then validating findings using samples from existing biobanks. It also outlines several issues, lessons, and principles for biobanking including usefulness, existing biobanks, importance of identifiers, role of data science, standards, informed consent, and ongoing needs and opportunities.
Biobanking a user’s perspective: Dr. Jonathan Pevsner
1. Biobanking: a user’s perspective
and an overview
Jonathan Pevsner, Ph.D.
Professor, Dept. of Neurology
Kennedy Krieger Institute and Johns Hopkins Medicine
Chief Scientific Officer, Sturge-Weber Foundation
pevsner@kennedykrieger.org
Data Science Forum: NIH Data Science SIG
Global Perspective on Biobanking and Access to Samples
June 23, 2017
3. Outline
From genotype to phenotype: a framework
Three biobanking examples
Postmortem brains from the NIH NeuroBioBank
Establishing a rare disease biobank
From a large genomics dataset to biobank samples
Issues, lessons and principles
1. Usefulness
2. Existing biobanks
3. GUIDs: the importance of labels
4. Data science is integral to biobanking
5. Standards
6. Informed consent
7. Needs and opportunities
4. Fundamental framework: genotype to phenotype
The relationship between genotype and phenotype represents one of the most fundamental and challenging problems in biomedical science.
Genotype → Phenotype
6. Fundamental framework: genotype to phenotype
We can provide a framework for this problem.
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Additional levels: RNA, pathways, circuits
7. Fundamental framework: genotype to phenotype
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000
8. Fundamental framework: genotype to phenotype
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000 (highlighted: ABCD1)
ABCD1 mutation → severe childhood disease (ALD), mild adult onset disease (AMN), or apparently normal
One gene mutation can have different phenotypic consequences: the same ABCD1 mutation may result in severe childhood-onset adrenoleukodystrophy (ALD), milder adult-onset adrenomyeloneuropathy (AMN), or no symptoms.
9. Fundamental framework: genotype to phenotype
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000 (highlighted: GNAQ)
GNAQ mutation, by cell type:
• melanocytes: uveal melanoma
• endothelial cells: Sturge-Weber syndrome
• blood: apparently normal
One gene mutation can have different consequences: when and where mutations occur is crucial.
10. Fundamental framework: genotype to phenotype
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000
One disease phenotype, multiple genetic contributors
For almost all diseases (including common diseases such as autism or bipolar disorder) we search for multiple genetic variants that confer risk for a phenotype.
12. A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
A port-wine birthmark affects about 1:300 people. It varies in size and location.
Sturge-Weber syndrome affects < 1:20,000 people. It affects a subset of individuals with a facial port-wine birthmark.
14. A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
DNA from blood (presumed unaffected) → sequence the genome
DNA from port-wine birthmark (presumed affected) → sequence the genome
Compare the two genomes.
We identified a mosaic mutation in GNAQ as causing Sturge-Weber syndrome and port-wine birthmarks (NEJM, PMID 23656586). We analyzed samples from 3 individual patients.
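The paired comparison described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the analysis pipeline used in the cited study: the `AlleleCounts` class, the `is_mosaic_candidate` function, every threshold, and the read counts are hypothetical and chosen only to show the idea. A variant is flagged when it appears at a low fraction of reads in the presumed-affected tissue and is essentially absent from blood.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlleleCounts:
    """Sequencing read counts at one genomic position in one sample."""
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the variant allele

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def alt_fraction(self) -> float:
        return self.alt_reads / self.depth if self.depth else 0.0


def is_mosaic_candidate(
    affected: AlleleCounts,
    unaffected: AlleleCounts,
    min_depth: int = 100,
    min_affected_fraction: float = 0.01,
    max_affected_fraction: float = 0.35,
    max_unaffected_fraction: float = 0.005,
) -> bool:
    """Flag a variant seen at low fraction in affected tissue but not in blood.

    All thresholds are illustrative assumptions, not published cutoffs.
    """
    if affected.depth < min_depth or unaffected.depth < min_depth:
        return False  # too few reads to compare the two samples
    return (
        min_affected_fraction <= affected.alt_fraction <= max_affected_fraction
        and unaffected.alt_fraction <= max_unaffected_fraction
    )


# Made-up read counts, for illustration only.
birthmark = AlleleCounts(ref_reads=950, alt_reads=50)  # 5% variant reads
blood = AlleleCounts(ref_reads=1000, alt_reads=0)      # no variant reads
print(is_mosaic_candidate(birthmark, blood))
```

The upper bound on the affected-tissue fraction is there because an inherited heterozygous variant would be expected near 50% of reads in both samples, so it should not be flagged as mosaic.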
15. A user’s perspective on biobanking: three examples.
(1) Sturge-Weber syndrome and a brain bank
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000 (highlighted: GNAQ)
After finding the GNAQ mutation we turned to the NIH NeuroBioBank at the University of Maryland. We obtained 97 samples to validate our findings. The availability of these samples from a biobank was crucial!
16. A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
I am Chief Scientific Officer of the Sturge-Weber
Foundation. We need to create (or join) a biobank.
• Patients and families tell me “I want to donate my brain
and body to science. Can you help?” What’s the plan,
and are there informed consent issues?
• Scientists have discovered that the GNAQ mutation
occurs primarily in endothelial cells, and cell lines
have been established from brain biopsies. How can
researchers share and access these cell lines?
• Are there standards that we should follow in describing
the genotype and phenotype of Sturge-Weber
syndrome samples and patients?
• Have these problems been addressed by those studying
related diseases?
17. A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000 (highlighted: GNAQ)
It’s important to link clinical data (e.g. from a patient
registry) with data generated from biospecimens!
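One way to picture this linkage is a join on a shared participant identifier. The sketch below is only an illustration under stated assumptions: the field names (`participant_id`, `diagnosis`, `specimen_id`, `tissue`), the `link_records` helper, and all records are invented placeholders, not the schema of any real registry or biobank.

```python
from collections import defaultdict


def link_records(registry, specimens, key="participant_id"):
    """Attach biospecimen records to registry records that share an ID.

    Returns (linked, orphans): registry entries with their specimens,
    and specimens whose participant is missing from the registry.
    """
    by_participant = defaultdict(list)
    for specimen in specimens:
        by_participant[specimen[key]].append(specimen)

    linked = [
        {**entry, "specimens": by_participant.get(entry[key], [])}
        for entry in registry
    ]
    registry_ids = {entry[key] for entry in registry}
    orphans = [s for s in specimens if s[key] not in registry_ids]
    return linked, orphans


# Invented placeholder records, for illustration only.
registry = [
    {"participant_id": "P001", "diagnosis": "Sturge-Weber syndrome"},
    {"participant_id": "P002", "diagnosis": "port-wine birthmark only"},
]
specimens = [
    {"participant_id": "P001", "specimen_id": "S-10", "tissue": "skin"},
    {"participant_id": "P001", "specimen_id": "S-11", "tissue": "blood"},
    {"participant_id": "P003", "specimen_id": "S-12", "tissue": "brain"},
]

linked, orphans = link_records(registry, specimens)
for entry in linked:
    print(entry["participant_id"], [s["specimen_id"] for s in entry["specimens"]])
print("unlinked specimens:", [s["specimen_id"] for s in orphans])
```

Reporting the unlinked specimens separately makes gaps visible: a sample with no clinical record is much less useful to a researcher than one with a phenotype attached.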
20. A user’s perspective on biobanking: three examples.
(2) Establishing a Sturge-Weber syndrome biobank
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000 (highlighted: GNAQ)
• What information do we need to capture about cell lines, brain, and skin samples?
• What information do we need to capture about the phenotypes as we collect samples at diverse sites?
• How do we relate genomic DNA sequence findings, RNA-seq, and proteomics to the samples?
21. A user’s perspective on biobanking: three examples.
(3) Discovering mosaic mutations in 9000 autism samples
We asked whether mosaic mutations occur in autism. By
applying to NIH we obtained previously generated whole-
exome sequence data on 9000 individuals via the Simons
Simplex Collection (SSC). We discovered that mosaic variation is
enriched in children with autism spectrum disorder.
To validate our findings, we applied to the Simons
Foundation and received approval to obtain DNA from a
Rutgers repository (http://www.rucdr.org/). We purchased
300 DNA samples and successfully validated our findings.
See PMID 27632392.
24. A user’s perspective on biobanking: three examples.
(3) Discovering mosaic mutations in 9000 autism samples
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000
A user starts with genomics data…
…then purchases cell lines or DNA or brain chunks for further studies…
Obtaining clinical phenotypes from the biobank is essential.
26. (1) Usefulness
• Diseases are considered rare when affecting fewer than
200,000 people (U.S. definition) or no more than
1:2,000 people (European definition).
• There are ~6,800 rare diseases.
• Biobanks offer crucial resources to help solve the
causes of rare diseases—and to study diagnosis,
prevention, and treatment.
• Biobanks offer a range of cell and solid tissue types
(e.g. brain, heart, fibroblasts, lymphoblastoid cell
lines) and bodily fluids.
• Biobanks offer biospecimens from individuals,
pedigrees, and/or populations.
• Samples from biobanks are complemented by
phenotypic and genotypic data.
27. List of panelists
Jonathan Pevsner, Professor, Dept. of Neurology, Kennedy Krieger Institute, and Dept. of Psychiatry and Behavioral Sciences, Johns Hopkins Medicine. Presentation title: “Biobanking: a user’s perspective and an overview”
David van Enckevort, Project Manager, BBMRI & RD-Connect, Department of Genetics, University Medical Center Groningen (UMCG). Presentation title: “FAIR (Findable, Accessible, Interoperable and Reusable) data and sample access”
Manuel Posada de la Paz, Director, Research Institute for Rare Diseases (Instituto de Investigación en Enfermedades Raras), a member of the EuroBioBank. Presentation title: “Rare diseases biological samples: small collections and research”
Kerry Wiles, Program Director, VUMC Tissue Repository; CHTN (Cooperative Human Tissue Network) Western Coordinator. Presentation title: “An academic prospective procurement repository: From Donor to Bench”
Jim Vaught, Editor-in-Chief, Biopreservation Journal; past President of the International Society for Biological and Environmental Repositories (ISBER); on the board of directors for ISBER and NDRI (National Disease Research Interchange). Presentation title: “NIH and ISBER perspectives on specimen locators”
Daniel Catchpoole, Director of Kids Research Institute, The Children's Hospital at Westmead (Australia). Presentation title: “The Australian experience, issues and solution”
28. (2) Examples of existing biobanks and biobank initiatives
Coriell Biorepository
The NIGMS collection has >11,000 cell lines and
~6,000 DNA samples.
https://catalog.coriell.org/
NIH NeuroBioBank
6 sites. The University of Maryland Brain & Tissue Bank
has distributed 35,000 tissue samples to >900
researchers.
https://neurobiobank.nih.gov/
Cooperative Human Tissue Network (CHTN)
Supported by the National Cancer Institute
https://www.chtn.org/
29. (2) Examples of existing biobanks and biobank initiatives
EuroBioBank
130,000 samples available; 13,000 collected and >7,000
samples distributed per year.
http://www.eurobiobank.org/
RD-Connect
“An integrated platform connecting databases,
registries, biobanks and clinical bioinformatics for rare
disease research.”
http://rd-connect.eu/
Research Institute for Rare Diseases
(Instituto de Investigación en Enfermedades Raras), a
member of the EuroBioBank.
http://www.eurobiobank.org/en/partners/description/isciii.htm
30. (2) Examples of existing biobanks and biobank initiatives
BBMRI-ERIC
Biobanking and Biomolecular Resources Research
Infrastructure, European Research Infrastructure
Consortium.
http://www.bbmri-eric.eu/BBMRI-ERIC/common-service-it/
Kids Research Institute, The Children's Hospital at
Westmead (Australia)
http://www.kidsresearch.org.au/our-facilities/bio-banks
National Disease Research Interchange (NDRI)
“The mission of NDRI is to provide human biospecimens to
advance biomedical/bioscience research and development
worldwide.”
http://ndriresource.org/
31. (2) Examples of existing biobanks and biobank initiatives
All of Us
“The All of Us Research Program seeks to extend
precision medicine to all diseases by building a national
research cohort of one million or more U.S. participants.”
It includes a biobank.
https://allofus.nih.gov/about/program-components
NIMH Repository and Genomics Resource (NIMH-RGR)
“…plays a key role in facilitating psychiatric genetic
research by providing a collection of over 150,000 well
characterized, high quality patient and control samples
from a wide-range of mental disorders.”
https://www.nimhgenetics.org/
32. (3) GUIDs: the importance of labels
“Accession numbers” are alphanumeric strings that
provide links to various kinds of data or records. For
example, NP_620258.1 is an accession number
corresponding to a protein sequence.
A GUID is a Global Unique Identifier that corresponds to
a study participant. The GUID facilitates tracking a participant’s
data across studies and locations and over time in a
de-identified manner.
Example 1: a participant was recruited twice (years apart)
to a single study.
Example 2: a trio was recruited into two separate autism
genome sequencing studies (one study excluded severe
phenotypes, one excluded mild phenotypes). The proband’s
phenotype had become severe over time.
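To make the idea concrete, here is a minimal sketch of how a stable, de-identified label could be derived from a participant's identifying fields using a keyed hash. It is not the algorithm used by any NIH GUID tool. The choice of input fields, the `normalize` and `make_pseudonymous_id` helpers, the `DEMO` prefix, and the secret key are all assumptions made for illustration.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Reduce a field to a canonical form: no accents, no spaces, upper case."""
    text = unicodedata.normalize("NFKD", value)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    return "".join(text.split()).upper()


def make_pseudonymous_id(first_name, last_name, birth_date, birth_city,
                         secret_key: bytes, prefix: str = "DEMO") -> str:
    """Derive a repeatable identifier without storing the inputs.

    The fields and output format are hypothetical. A keyed hash (HMAC)
    is used so the label cannot be recomputed without the secret key.
    """
    fields = [normalize(f) for f in (first_name, last_name, birth_date, birth_city)]
    message = "|".join(fields).encode("utf-8")
    digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    return f"{prefix}-{digest[:16].upper()}"


key = b"demo-key-not-for-real-use"  # placeholder secret

# The same (fictional) person enrolled twice, with inconsistent data entry.
first_visit = make_pseudonymous_id("Jane", "Doe", "2001-02-03", "Baltimore", key)
second_visit = make_pseudonymous_id(" jane", "DOE ", "2001-02-03", "baltimore", key)
other_person = make_pseudonymous_id("John", "Doe", "2001-02-03", "Baltimore", key)

print(first_visit == second_visit)  # same participant, same label
print(first_visit == other_person)  # different participant, different label
```

Because the label depends only on the normalized inputs and the key, a participant recruited twice (Example 1) or enrolled in two studies (Example 2) would receive the same label each time, which is what allows records to be recognized as belonging to one person.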
33. (4) Data science is integral to biobanking
Biobanking requires a series of tasks such as obtaining
biospecimens and associated metadata (e.g. phenotypic
data, cause of death, postmortem interval, cell culture
conditions, imaging data, genomics data).
Goals include effective communication, standardization
(e.g. of protocols), and an electronic portal to a
repository.
All this requires data science.
34. Biobanks must integrate diverse data types
Genotype → Phenotype
DNA → protein → cell → organ → individual → population
Genes: gene1, gene2, …, gene20,000
Data types to integrate:
• Sequence data: genomic DNA (dbGaP), RNAseq
• Proteomics data, metabolomics
• Cell culture, biochemical, iPSC data
• Imaging data
• Phenotypic test data (e.g. neuropsychology)
• Epidemiology
35. (5) Standards
Biobanks implement Data Dictionaries to manage data
elements (and data structures) in a uniform manner. The
use of Common Data Elements is crucial.
NIH Common Data Elements (CDE) Repository
“designed to provide access to structured human and
machine-readable definitions of data elements.”
https://cde.nlm.nih.gov/home
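As a rough sketch of what a data dictionary buys you, the Python below checks incoming specimen records against a small set of element definitions. The element names, permitted values, and the `validate` helper are hypothetical and are not taken from the NIH CDE Repository. The point is only that a shared, machine-readable definition lets every site be checked in the same way.

```python
# Hypothetical data dictionary: element name -> definition.
DATA_DICTIONARY = {
    "specimen_type": {
        "type": str,
        "allowed": {"brain", "skin", "blood", "cell line"},
    },
    "postmortem_interval_hours": {"type": (int, float), "min": 0},
    "diagnosis": {"type": str},
}


def validate(record: dict, dictionary: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for name, rule in dictionary.items():
        if name not in record:
            problems.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, rule["type"]):
            problems.append(f"{name}: wrong type {type(value).__name__}")
            continue
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{name}: value {value!r} not permitted")
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: below minimum {rule['min']}")
    for name in record:
        if name not in dictionary:
            problems.append(f"{name}: not in data dictionary")
    return problems


# Invented records, for illustration only.
good = {"specimen_type": "brain", "postmortem_interval_hours": 12,
        "diagnosis": "Sturge-Weber syndrome"}
bad = {"specimen_type": "Brain tissue", "postmortem_interval_hours": -3,
       "site_notes": "free text"}

print(validate(good, DATA_DICTIONARY))
for problem in validate(bad, DATA_DICTIONARY):
    print(problem)
```

Rejecting free-text variants such as "Brain tissue" at submission time is what keeps samples collected at different sites comparable later.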
37. (6) Informed consent
Research studies (in contrast to clinical tests) are under
Institutional Review Board (IRB) jurisdiction. A
researcher must have an approved research protocol
and one or more approved consent forms.
Biobanks provide biospecimens that are sometimes in
the realm of human subjects research. Appropriate
consent forms must be administered for biospecimens
to be deposited in a biobank.
An emerging issue is obtaining appropriate consent for
DNA to be sequenced from biospecimens. Because of
the nature of contemporary sequencing, samples are no
longer inherently deidentifiable.
38. (7) Needs and opportunities
We need resources and efforts such as the following:
• coordination of biobanking efforts across diverse
initiatives.
• awareness and adoption of community standards for
biobanking.
• flexibility to adapt to changing technologies (e.g.
sequencing technologies).
• integrated platforms and bioinformatics tools.