D2S1338 has several population or allele specific SNPs:
- G-T 4bp 3' is well distributed across populations and associated with multiple alleles
- Additional SNPs are population specific or rare
Overall these SNPs increase the number of associated D2S1338 alleles
This document summarizes a presentation on next generation sequencing (NGS) technologies for forensic DNA analysis. It discusses multiplex assays available for sequencing short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) and compares their performance in terms of number of loci, allele coverage, and heterozygote balance. It also describes how NGS can provide additional genetic information at STR loci through sequencing of flanking regions and resolution of mixtures. NGS enables high-resolution ancestry and phenotype prediction from SNP data.
Jan2015 GIAB intro, Update, and Data Analysis PlanningGenomeInABottle
The Genome in a Bottle Consortium is developing reference materials, methods, and data to evaluate accuracy of human genome sequencing. The goal is to enable validation and regulation of clinical genome sequencing. For their pilot genome, NA12878, they have generated high-confidence variant calls that are being used for benchmarking. Moving forward they plan to analyze and develop reference materials from trios in the Personal Genome Project, starting with an Ashkenazi Jewish trio.
Jan2015 using the pilot genome rm for clinical validation steve lincolnGenomeInABottle
The GIAB reference material (NA12878) is useful for validating diagnostic genetic tests but has some limitations. It provided many single nucleotide variants for analysis but few examples of more complex and clinically relevant variants like large indels, copy number variants, and variants in difficult regions. While it demonstrated high accuracy for common variants, it did not adequately represent the full spectrum of variant types seen in clinical cases. More data is needed from a wider diversity of samples with more engineered variants to fully validate tests for their ability to detect pathogenic variants.
Aug2013 NIST highly confident genotype calls for NA12878GenomeInABottle
This document discusses the creation of a set of highly confident genotypes for the genome NA12878 by integrating data from 12 datasets across 5 sequencing platforms. The goals were to define regions of the genome with no false positive or negative genotype calls, include as much of the genome as possible, and avoid biases. Candidate variants were identified and datasets were removed that showed characteristics of sequencing, mapping, or alignment biases until all datasets agreed. Regions like repeats and duplications were labeled uncertain. Verification showed the calls were highly accurate except for some systematic errors and complex variants. The data and calls are available to help assess performance of variant calling.
Recent breakthroughs in genome editing technology have led to a rapid adoption that parallels that seen with RNAi. And like RNAi, these methods are taking the scientific world by storm, with high profile publications in fields as diverse as HIV treatment, stem cell therapy, food crop modification and drug development to name but a few.
Critically, the endogenous modification of genes enables the study of their function in a physiological context. It also overcomes some of the artefacts that can result from established techniques such as transgenesis and RNAi, which have mislead researchers with false positives or negatives. Until recently however genome editing required considerable technical expertise, and consequently was a relatively niche pursuit.
In this talk we will look at how the latest developments in genome editing tools have changed this, with improvements in both ease-of-use and targeting efficiency, as well as a concomitant reduction in costs opening up these approaches to the wider scientific community.
Rapid adoption of the CRISPR/Cas9 system has for example led to a long list of organisms and tissues in which genetic changes have been made with high efficiency. Other technologies such as recombinant adeno-associated virus (rAAV) offer further precision, stimulating the cell’s high-fidelity DNA repair pathways to insert exogenous sequence with unrivalled specificity. Targeting efficiency can be improved still further by using the technologies in combination – genome cutting induced by CRISPR can significantly enhance homologous recombination mediated by rAAV.
Despite these rapid advances, some pitfalls remain, and so we’ll discuss some of the key considerations for avoiding these, ranging from simply picking the right tool for the job to designing an experiment that maximises chances of success.
Finally we’ll look at how genome editing is being applied to both basic and translational research, and in both a gene-specific and genome wide manner. For the study of disease associated genes and mutations scientists can now complement wide panels of tumour cells with genetically defined isogenic cell pairs identical in all but precise modifications in their gene of interest. The ease-of-design and efficiency of the CRISPR system is also being exploited for genome wide synthetic lethality screens, facilitating rapid drug target identification with significantly reduced risk of false negatives and off-target false positives. And again, further synergies are achieved when these approaches are combined to look for potential synthetic lethal targets in specific genomic contexts.
The key considerations of crispr genome editingChris Thorne
While CRISPR is simple to use, widely applicable and often highly efficient, there are a number of things to keep in mind to maximise experimental success. Here's what we recommend...
ICMP MPS SNP Panel for Missing Persons - Michelle Peck et al.QIAGEN
Optimization and Performance of a Very Large MGS SNP Panel for Missing Persons, by Michelle Peck et al., International Commission on Mission Persons. Presented May 3, 2018, at the QIAGEN Investigator Forum, San Antonio, TX.
This document summarizes a presentation on next generation sequencing (NGS) technologies for forensic DNA analysis. It discusses multiplex assays available for sequencing short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) and compares their performance in terms of number of loci, allele coverage, and heterozygote balance. It also describes how NGS can provide additional genetic information at STR loci through sequencing of flanking regions and resolution of mixtures. NGS enables high-resolution ancestry and phenotype prediction from SNP data.
Jan2015 GIAB intro, Update, and Data Analysis PlanningGenomeInABottle
The Genome in a Bottle Consortium is developing reference materials, methods, and data to evaluate accuracy of human genome sequencing. The goal is to enable validation and regulation of clinical genome sequencing. For their pilot genome, NA12878, they have generated high-confidence variant calls that are being used for benchmarking. Moving forward they plan to analyze and develop reference materials from trios in the Personal Genome Project, starting with an Ashkenazi Jewish trio.
Jan2015 using the pilot genome rm for clinical validation steve lincolnGenomeInABottle
The GIAB reference material (NA12878) is useful for validating diagnostic genetic tests but has some limitations. It provided many single nucleotide variants for analysis but few examples of more complex and clinically relevant variants like large indels, copy number variants, and variants in difficult regions. While it demonstrated high accuracy for common variants, it did not adequately represent the full spectrum of variant types seen in clinical cases. More data is needed from a wider diversity of samples with more engineered variants to fully validate tests for their ability to detect pathogenic variants.
Aug2013 NIST highly confident genotype calls for NA12878GenomeInABottle
This document discusses the creation of a set of highly confident genotypes for the genome NA12878 by integrating data from 12 datasets across 5 sequencing platforms. The goals were to define regions of the genome with no false positive or negative genotype calls, include as much of the genome as possible, and avoid biases. Candidate variants were identified and datasets were removed that showed characteristics of sequencing, mapping, or alignment biases until all datasets agreed. Regions like repeats and duplications were labeled uncertain. Verification showed the calls were highly accurate except for some systematic errors and complex variants. The data and calls are available to help assess performance of variant calling.
Recent breakthroughs in genome editing technology have led to a rapid adoption that parallels that seen with RNAi. And like RNAi, these methods are taking the scientific world by storm, with high profile publications in fields as diverse as HIV treatment, stem cell therapy, food crop modification and drug development to name but a few.
Critically, the endogenous modification of genes enables the study of their function in a physiological context. It also overcomes some of the artefacts that can result from established techniques such as transgenesis and RNAi, which have mislead researchers with false positives or negatives. Until recently however genome editing required considerable technical expertise, and consequently was a relatively niche pursuit.
In this talk we will look at how the latest developments in genome editing tools have changed this, with improvements in both ease-of-use and targeting efficiency, as well as a concomitant reduction in costs opening up these approaches to the wider scientific community.
Rapid adoption of the CRISPR/Cas9 system has for example led to a long list of organisms and tissues in which genetic changes have been made with high efficiency. Other technologies such as recombinant adeno-associated virus (rAAV) offer further precision, stimulating the cell’s high-fidelity DNA repair pathways to insert exogenous sequence with unrivalled specificity. Targeting efficiency can be improved still further by using the technologies in combination – genome cutting induced by CRISPR can significantly enhance homologous recombination mediated by rAAV.
Despite these rapid advances, some pitfalls remain, and so we’ll discuss some of the key considerations for avoiding these, ranging from simply picking the right tool for the job to designing an experiment that maximises chances of success.
Finally we’ll look at how genome editing is being applied to both basic and translational research, and in both a gene-specific and genome wide manner. For the study of disease associated genes and mutations scientists can now complement wide panels of tumour cells with genetically defined isogenic cell pairs identical in all but precise modifications in their gene of interest. The ease-of-design and efficiency of the CRISPR system is also being exploited for genome wide synthetic lethality screens, facilitating rapid drug target identification with significantly reduced risk of false negatives and off-target false positives. And again, further synergies are achieved when these approaches are combined to look for potential synthetic lethal targets in specific genomic contexts.
The key considerations of crispr genome editingChris Thorne
While CRISPR is simple to use, widely applicable and often highly efficient, there are a number of things to keep in mind to maximise experimental success. Here's what we recommend...
ICMP MPS SNP Panel for Missing Persons - Michelle Peck et al.QIAGEN
Optimization and Performance of a Very Large MGS SNP Panel for Missing Persons, by Michelle Peck et al., International Commission on Mission Persons. Presented May 3, 2018, at the QIAGEN Investigator Forum, San Antonio, TX.
An Introduction to Crispr Genome EditingChris Thorne
In this short presentation, I make a case for doing genome editing vs some of the approaches that have gone before, describe some of the tools available, and the focus on CRISPR-Cas9, what it is, where it's come from and how it works.
The document discusses challenges in characterizing structural variations across genomes and populations using long-read sequencing technologies. It describes tools developed for accurate mapping and structural variant calling from long reads, including NGMLR for mapping and Sniffles for variant calling. It also presents results of applying these tools to call structural variants in the NA12878 genome from PacBio and Nanopore long reads, detecting many more variants than from short read data. The goal is to better understand variability between genomes and impact on gene regulation.
GENESIS™: Comprehensive genome editing - Translating genetic information into personalised medicines.
Horizon is the only source of rAAV expertise and is uniquely capable of exploiting multiple platforms: CRISPR, ZFNs and rAAV singularly or combined. Horizon’s scientists are experts at all forms of gene editing and so have the experience to help guide customers towards the approach that best suits their project
Molecular QC: Interpreting your Bioinformatics PipelineCandy Smellie
What is the impact of assay failure in your laboratory and how do you monitor for it?
The most heavily degraded samples are not suitable for standard exome coverage: sometimes it’s not even a matter of getting bad sequencing, you might get nothing at all!
FFPE artifacts increase with storage time
Artifacts go against the statistical power of your variant calling analysis
Molecular reference standards help filter out bad mappings and spurious variants
Bioinformatics pipelines allow adding Molecular Reference Standards in your joint variant calling pipeline
Genome In A Bottle Reference Standards are invaluable for validating variant calling analysis
NIST and its collaborators shared datasets created with most NGS technologies
Horizon Diagnostics shared annotated, merged variant calls from NIST for the Ashkenazim Trio
~35K variants are predicted having high or moderate impact within the Trio
GM24385 (Ashkenazim Son) includes 352 small variants with high/moderate impact which are absent in Father and Mother
Routinely monitor the performance of your workflows and assays with independent external controls
Research Program Genetic Gains (RPGG) Review Meeting 2021: Groundnut genomic ...ICRISAT
These high quality genomes are global resource and are being used by all the genomics and breeding researchers across the world including ICRISAT. High density genotyping assays developed and currently been deployed for generating high throughput and high density genotyping data on germplasm and breeding lines.
Genome Editing Comes of Age; CRISPR, rAAV and the new landscape of molecular ...Candy Smellie
Information is no longer a bottleneck, emphasis is shifting to the ‘what does it all mean’
In a translational context we hope that by answering that question we will be able to is to characterise the genetics that drive disease, and indeed develop drugs and diagnostics that are personalised to patients.
Genome editing provides the link between the information here, and this outcome here, by allowing scientists to recapitulate specific genetic alterations in any gene in any living tissue to probe function, develop disease models and identify therapeutic strategies. So, not only do we now have unparalleled access to genetic information, but we now have the tools to most accuartely understand what this genetic information – with genome editing allowing us to explore the genetic drivers of disease in physiological models.
AAV is a single-stranded, linear DNA virus with a a 4.7 kb genome which for the purpose of genome editing is replaced almost in entirety with the targeting vector sequence (except for the iTRs)
It is in effect a highly effective DNA delivery mechanism
After entry of the vector into the cell, target-specific homologous DNA is believed to activate and recruit HR-dependent repair factors can induce HR at rates approximately 1,000 times greater than plasmid based double stranded DNA vectors, but the mechanism by which it achieves this is still largely unknown
By including a selection cassette can select for cells that have integrated the targeting vector, and then screen for clones which have undergone targeted insetion rather than random integration, which will generally be around 1%.
1) The document outlines research using genomics tools like GBS and DArT analysis to advance yam breeding for traits like yield and quality. Over 900 yam accessions were analyzed to identify genetic diversity and population structure.
2) Highly polymorphic SNP markers were validated and used in GWAS to identify genomic regions associated with traits. Bi-parental populations were also analyzed to identify QTLs.
3) Molecular markers were developed for early prediction of flower sex, an important trait for yam breeding. The research aims to apply markers to accelerate yam breeding through genomic selection.
Presentation by Justin Zook at GRC/GIAB ASHG 2017 workshop "Getting the most from the reference assembly and reference materials" on benchmarks for indels and structural variants.
This document discusses challenges in using the GRCh38 human reference genome assembly from the perspective of a company that provides clinical genomic sequencing services. It describes how the company used incremental steps including alignment fix patches to improve migration from GRCh37 to GRCh38. It also explains difficulties in fully representing genes and variants located across multiple reference locations or assemblies within common data formats.
Alzheimer’s disease (AD) is a devastating neurodegenerative disease that is genetically complex. Although great progress has been made in identifying fully penetrant mutations in genes that cause early-onset AD, these still represent a very small percentage of AD cases. Large-scale, genome-wide association studies (GWAS) have identified at least 20 additional genetic risk loci for the more common form: late-onset AD. However, the identified SNPs are typically not the actual risk variants, but are in linkage disequilibrium with the presumed causative variants [1].
To help identify causative genetic variants, we have combined highly accurate, long-read sequencing with hybrid-capture technology. In this collaborative webinar*, we present this method and show how combining IDT xGen® Lockdown® Probes with PacBio SMRT® Sequencing allows targeting and sequencing of candidate genes from genomic DNA and corresponding transcripts from cDNA. Using a panel of target capture probes for 35 AD candidate genes, we demonstrate the power of this approach by looking at data for two individuals with AD. Some additional benefits of this method include the ability to leverage long reads, phase heterozygous variants, and link corresponding transcript isoforms to their respective alleles.
Reference: 1. Van Cauwenberghe C, Van Broeckhoven C, Sleegers K. (2016) The genetic landscape of Alzheimer disease: clinical implications and perspectives. Genet Med, 18(5):421–430.
* This presentation represents a collaboration between Pacific Biosciences and Integrated DNA Technologies. The individual opinions expressed may not reflect shared opinions of Pacific Biosciences and Integrated DNA Technologies.
Dr. Chris Lowe presented on Horizon Discovery's precision genome editing platform called GENESISTM. The presentation discussed optimizing GENESISTM by combining CRISPR and rAAV technologies to improve gene targeting efficiency. Custom cell line development services are offered to modify genes of interest in various cell lines for applications such as generating disease models and studying drug sensitivity. Key considerations for successful gene editing experiments include factors like gene/cell line selection, gRNA design/activity, donor design, screening/validation approaches. Case studies demonstrated applications of engineered cell lines.
Why Y Chromosome Markers are an Ever Expanding Essential Tool in Sexual Assau...Thermo Fisher Scientific
Why Y Chromosome Markers are an Ever Expanding Essential Tool in Sexual Assault Investigations
Jack Ballantyne, Erin Hanson
National Center for Forensic Science, UCF
Sexual Assault by the numbers*:
60% of sexual assaults are not reported to the police.
97% of rapists never spend a day in jail.
There are an average of 237,868 victims of rape and sexual assault each year.
Every 2 minutes, another American is sexually assaulted.
1 out of every 5 American women have been the victim of an attempted or completed rape in her lifetime.
9 out of 10 rape victims are female.
*Source: Rape, Abuse & Incest National Network (RAINN)
In the US, an estimated 19.3% of women and 1.7% of men have been raped during their lifetimes**
**-CDC Sept 2014
NCFS Evaluation of Quantifiler® Trio / Yfiler® Plus:
Can we improve our analysis of extended interval post coital samples
using the integrated/modular Quantifiler® Trio/Yfiler® Plus system?
Conclusions:
Quantitation System
• Excellent correlation between male quantitation and profile recovery
• Negative quant (threshold based) means negative/unusable Yfiler® Plus kit
• Observed autosomal/Y results consistent with expected based upon quant and F/M ratio
Yfiler® Plus kit
• Sensitive chemistry
• If >100 pg male, most full profiles
• Results obtained with high quantities of background female DNA (up to mg)
• Many ‘Usable profiles’ F/M ranging from 333:1-100,000:1
• Y-STR profiles using YFP can be obtained at least up to 9 days after intercourse
The CRISPR/Cas9 system has emerged as one of the leading tools for modifying genomes of organisms ranging from E. coli to humans. Additionally, the simple gene targeting mechanism of CRISPR technology has been modified and adapted to other applications that include gene regulation, detection of intercellular trafficking, and pathogen detection. With a wealth of methods for introducing Cas9 and gRNAs into cells, it can be challenging to decide where to start. In this presentation, Dr Adam Clore describes the CRISPR mechanism and some of the most prominent uses for CRISPR, along with methods where IDT technologies can assist scientists in designing, testing, and executing a variety of CRISPR-mediated experiments. For more informaton, visit: http://www.idtdna.com/crispr
This document provides an overview of genome editing techniques such as CRISPR/Cas9 and rAAV and considerations for their use. It discusses how CRISPR/Cas9 and rAAV work to edit genomes and compares their advantages. Key factors for CRISPR gene editing are discussed such as gRNA design, donor design, and screening/validation approaches. The document also summarizes research optimizing CRISPR gene editing through improvements like testing different donor lengths and modifications. The goal is to translate genetic information into personalized medicines by leveraging tools like CRISPR and rAAV.
Speaker: Benedict C. S. Cross, PhD, Team leader (Discovery Screening), Horizon Discovery
CRISPR–Cas9 mediated genome editing provides a highly efficient way to probe gene function. Using this technology, thousands of genes can be knocked out and their function assessed in a single experiment. We have conducted over 150 of these complex and powerful screens and will use our experience to guide you through the process of screen design, performance and analysis.
We'll be discussing:
• How to use CRISPR screening for target ID and validation, understanding drug MOA and patient stratification
• The screen design, quality control and how to evaluate success of your screening program
• Horizon’s latest developments to the platform
• Horizon’s novel approaches to target validation screening
This document discusses sequencing a 17-member pedigree including individual NA12878 to 50x depth using different variant calling software and technologies. Sequencing a large pedigree allows unambiguous determination of parental chromosome transmission and identifies errors in over 99.7% of variant positions with 11 siblings. While adding more siblings increases confidence in variant calls and phasing, it also increases costs. Somatic deletions were found in NA12878 and another individual, but not carried by other children. Technical replicates validated over 96% of de novo SNV calls in NA12878. Moving forward, selecting pedigrees with multiple siblings and one high quality family over lower quality trios would provide the best reference, while long reads on one or two samples could
This document summarizes work on generating haplotype phased reference genomes for the wheat stripe rust fungus Puccinia striiformis f. sp. tritici. Key points:
1) Long-read PacBio sequencing was used to generate improved genome assemblies with fewer contigs and the ability to distinguish between the two haplotypes of the dikaryotic fungus.
2) Mapping of the assemblies showed distinct sequences corresponding to the two haplotypes.
3) Future work includes manual curation of the genome assembly, annotating genes and repeats, and investigating the interaction between the two fungal nuclei.
Next Generation Sequencing application in virologyEben Titus
Next Generation Sequencing (NGS) is a promising technique for virus diagnosis that provides several advantages over traditional Sanger sequencing. NGS workflows involve sample preparation, sequencing, and data analysis. NGS has various applications in virology including identifying viral quasispecies, detecting antiviral drug resistance mutations, discovering novel virus genotypes, and performing quality control of live vaccines. While NGS reduces costs and improves throughput over Sanger sequencing, analyzing large NGS datasets requires strong bioinformatics skills. Overall, NGS represents a significant improvement for virus research and diagnosis.
Genome Wide SNPs for Admixture Analysis and Selection Signaturesfirdous ahmad
The presentation presents basic concepts and describes various advanced uses of Genome Wide SNP markers for Admixture Analysis and Selection Signatures. The usage of SNP markers has made tremendous progress in recent times and it shall help in application of genomic selection in developing countries including India
Dr. Igor Paploski - Making Epidemiological Sense Out of Large Datasets of PRR...John Blue
Making Epidemiological Sense Out of Large Datasets of PRRS Sequences - Dr. Igor Paploski, from the 2018 Allen D. Leman Swine Conference, September 15-18, 2018, St. Paul, Minnesota, USA.
More presentations at http://www.swinecast.com/2018-leman-swine-conference-material
An Introduction to Crispr Genome EditingChris Thorne
In this short presentation, I make a case for doing genome editing vs some of the approaches that have gone before, describe some of the tools available, and the focus on CRISPR-Cas9, what it is, where it's come from and how it works.
The document discusses challenges in characterizing structural variations across genomes and populations using long-read sequencing technologies. It describes tools developed for accurate mapping and structural variant calling from long reads, including NGMLR for mapping and Sniffles for variant calling. It also presents results of applying these tools to call structural variants in the NA12878 genome from PacBio and Nanopore long reads, detecting many more variants than from short read data. The goal is to better understand variability between genomes and impact on gene regulation.
GENESIS™: Comprehensive genome editing - Translating genetic information into personalised medicines.
Horizon is the only source of rAAV expertise and is uniquely capable of exploiting multiple platforms: CRISPR, ZFNs and rAAV singularly or combined. Horizon’s scientists are experts at all forms of gene editing and so have the experience to help guide customers towards the approach that best suits their project
Molecular QC: Interpreting your Bioinformatics PipelineCandy Smellie
What is the impact of assay failure in your laboratory and how do you monitor for it?
The most heavily degraded samples are not suitable for standard exome coverage: sometimes it’s not even a matter of getting bad sequencing, you might get nothing at all!
FFPE artifacts increase with storage time
Artifacts go against the statistical power of your variant calling analysis
Molecular reference standards help filter out bad mappings and spurious variants
Bioinformatics pipelines allow adding Molecular Reference Standards in your joint variant calling pipeline
Genome In A Bottle Reference Standards are invaluable for validating variant calling analysis
NIST and its collaborators shared datasets created with most NGS technologies
Horizon Diagnostics shared annotated, merged variant calls from NIST for the Ashkenazim Trio
~35K variants are predicted having high or moderate impact within the Trio
GM24385 (Ashkenazim Son) includes 352 small variants with high/moderate impact which are absent in Father and Mother
Routinely monitor the performance of your workflows and assays with independent external controls
Research Program Genetic Gains (RPGG) Review Meeting 2021: Groundnut genomic ...ICRISAT
These high quality genomes are global resource and are being used by all the genomics and breeding researchers across the world including ICRISAT. High density genotyping assays developed and currently been deployed for generating high throughput and high density genotyping data on germplasm and breeding lines.
Genome Editing Comes of Age; CRISPR, rAAV and the new landscape of molecular ...Candy Smellie
Information is no longer a bottleneck, emphasis is shifting to the ‘what does it all mean’
In a translational context we hope that by answering that question we will be able to is to characterise the genetics that drive disease, and indeed develop drugs and diagnostics that are personalised to patients.
Genome editing provides the link between the information here, and this outcome here, by allowing scientists to recapitulate specific genetic alterations in any gene in any living tissue to probe function, develop disease models and identify therapeutic strategies. So, not only do we now have unparalleled access to genetic information, but we now have the tools to most accuartely understand what this genetic information – with genome editing allowing us to explore the genetic drivers of disease in physiological models.
AAV is a single-stranded, linear DNA virus with a a 4.7 kb genome which for the purpose of genome editing is replaced almost in entirety with the targeting vector sequence (except for the iTRs)
It is in effect a highly effective DNA delivery mechanism
After entry of the vector into the cell, target-specific homologous DNA is believed to activate and recruit HR-dependent repair factors can induce HR at rates approximately 1,000 times greater than plasmid based double stranded DNA vectors, but the mechanism by which it achieves this is still largely unknown
By including a selection cassette can select for cells that have integrated the targeting vector, and then screen for clones which have undergone targeted insetion rather than random integration, which will generally be around 1%.
1) The document outlines research using genomics tools like GBS and DArT analysis to advance yam breeding for traits like yield and quality. Over 900 yam accessions were analyzed to identify genetic diversity and population structure.
2) Highly polymorphic SNP markers were validated and used in GWAS to identify genomic regions associated with traits. Bi-parental populations were also analyzed to identify QTLs.
3) Molecular markers were developed for early prediction of flower sex, an important trait for yam breeding. The research aims to apply markers to accelerate yam breeding through genomic selection.
Presentation by Justin Zook at GRC/GIAB ASHG 2017 workshop "Getting the most from the reference assembly and reference materials" on benchmarks for indels and structural variants.
This document discusses challenges in using the GRCh38 human reference genome assembly from the perspective of a company that provides clinical genomic sequencing services. It describes how the company used incremental steps including alignment fix patches to improve migration from GRCh37 to GRCh38. It also explains difficulties in fully representing genes and variants located across multiple reference locations or assemblies within common data formats.
Alzheimer’s disease (AD) is a devastating neurodegenerative disease that is genetically complex. Although great progress has been made in identifying fully penetrant mutations in genes that cause early-onset AD, these still represent a very small percentage of AD cases. Large-scale, genome-wide association studies (GWAS) have identified at least 20 additional genetic risk loci for the more common form: late-onset AD. However, the identified SNPs are typically not the actual risk variants, but are in linkage disequilibrium with the presumed causative variants [1].
To help identify causative genetic variants, we have combined highly accurate, long-read sequencing with hybrid-capture technology. In this collaborative webinar*, we present this method and show how combining IDT xGen® Lockdown® Probes with PacBio SMRT® Sequencing allows targeting and sequencing of candidate genes from genomic DNA and corresponding transcripts from cDNA. Using a panel of target capture probes for 35 AD candidate genes, we demonstrate the power of this approach by looking at data for two individuals with AD. Some additional benefits of this method include the ability to leverage long reads, phase heterozygous variants, and link corresponding transcript isoforms to their respective alleles.
Reference: 1. Van Cauwenberghe C, Van Broeckhoven C, Sleegers K. (2016) The genetic landscape of Alzheimer disease: clinical implications and perspectives. Genet Med, 18(5):421–430.
* This presentation represents a collaboration between Pacific Biosciences and Integrated DNA Technologies. The individual opinions expressed may not reflect shared opinions of Pacific Biosciences and Integrated DNA Technologies.
Dr. Chris Lowe presented on Horizon Discovery's precision genome editing platform called GENESISTM. The presentation discussed optimizing GENESISTM by combining CRISPR and rAAV technologies to improve gene targeting efficiency. Custom cell line development services are offered to modify genes of interest in various cell lines for applications such as generating disease models and studying drug sensitivity. Key considerations for successful gene editing experiments include factors like gene/cell line selection, gRNA design/activity, donor design, screening/validation approaches. Case studies demonstrated applications of engineered cell lines.
Why Y Chromosome Markers are an Ever Expanding Essential Tool in Sexual Assau...Thermo Fisher Scientific
Why Y Chromosome Markers are an Ever Expanding Essential Tool in Sexual Assault Investigations
Jack Ballantyne, Erin Hanson
National Center for Forensic Science, UCF
Sexual Assault by the numbers*:
60% of sexual assaults are not reported to the police.
97% of rapists never spend a day in jail.
There are an average of 237,868 victims of rape and sexual assault each year.
Every 2 minutes, another American is sexually assaulted.
1 out of every 5 American women have been the victim of an attempted or completed rape in her lifetime.
9 out of 10 rape victims are female.
*Source: Rape, Abuse & Incest National Network (RAINN)
In the US, an estimated 19.3% of women and 1.7% of men have been raped during their lifetimes**
**-CDC Sept 2014
NCFS Evaluation of Quantifiler® Trio / Yfiler® Plus:
Can we improve our analysis of extended interval post coital samples
using the integrated/modular Quantifiler® Trio/Yfiler® Plus system?
Conclusions:
Quantitation System
• Excellent correlation between male quantitation and profile recovery
• Negative quant (threshold based) means negative/unusable Yfiler® Plus kit
• Observed autosomal/Y results consistent with expected based upon quant and F/M ratio
Yfiler® Plus kit
• Sensitive chemistry
• If >100 pg male, most full profiles
• Results obtained with high quantities of background female DNA (up to mg)
• Many ‘Usable profiles’ F/M ranging from 333:1-100,000:1
• Y-STR profiles using YFP can be obtained at least up to 9 days after intercourse
The CRISPR/Cas9 system has emerged as one of the leading tools for modifying genomes of organisms ranging from E. coli to humans. Additionally, the simple gene targeting mechanism of CRISPR technology has been modified and adapted to other applications that include gene regulation, detection of intercellular trafficking, and pathogen detection. With a wealth of methods for introducing Cas9 and gRNAs into cells, it can be challenging to decide where to start. In this presentation, Dr Adam Clore describes the CRISPR mechanism and some of the most prominent uses for CRISPR, along with methods where IDT technologies can assist scientists in designing, testing, and executing a variety of CRISPR-mediated experiments. For more informaton, visit: http://www.idtdna.com/crispr
This document provides an overview of genome editing techniques such as CRISPR/Cas9 and rAAV and considerations for their use. It discusses how CRISPR/Cas9 and rAAV work to edit genomes and compares their advantages. Key factors for CRISPR gene editing are discussed such as gRNA design, donor design, and screening/validation approaches. The document also summarizes research optimizing CRISPR gene editing through improvements like testing different donor lengths and modifications. The goal is to translate genetic information into personalized medicines by leveraging tools like CRISPR and rAAV.
Speaker: Benedict C. S. Cross, PhD, Team leader (Discovery Screening), Horizon Discovery
CRISPR–Cas9 mediated genome editing provides a highly efficient way to probe gene function. Using this technology, thousands of genes can be knocked out and their function assessed in a single experiment. We have conducted over 150 of these complex and powerful screens and will use our experience to guide you through the process of screen design, performance and analysis.
We'll be discussing:
• How to use CRISPR screening for target ID and validation, understanding drug MOA and patient stratification
• The screen design, quality control and how to evaluate success of your screening program
• Horizon’s latest developments to the platform
• Horizon’s novel approaches to target validation screening
This document discusses sequencing a 17-member pedigree including individual NA12878 to 50x depth using different variant calling software and technologies. Sequencing a large pedigree allows unambiguous determination of parental chromosome transmission and identifies errors in over 99.7% of variant positions with 11 siblings. While adding more siblings increases confidence in variant calls and phasing, it also increases costs. Somatic deletions were found in NA12878 and another individual, but not carried by other children. Technical replicates validated over 96% of de novo SNV calls in NA12878. Moving forward, selecting pedigrees with multiple siblings and one high quality family over lower quality trios would provide the best reference, while long reads on one or two samples could
This document summarizes work on generating haplotype phased reference genomes for the wheat stripe rust fungus Puccinia striiformis f. sp. tritici. Key points:
1) Long-read PacBio sequencing was used to generate improved genome assemblies with fewer contigs and the ability to distinguish between the two haplotypes of the dikaryotic fungus.
2) Mapping of the assemblies showed distinct sequences corresponding to the two haplotypes.
3) Future work includes manual curation of the genome assembly, annotating genes and repeats, and investigating the interaction between the two fungal nuclei.
Next Generation Sequencing application in virologyEben Titus
Next Generation Sequencing (NGS) is a promising technique for virus diagnosis that provides several advantages over traditional Sanger sequencing. NGS workflows involve sample preparation, sequencing, and data analysis. NGS has various applications in virology including identifying viral quasispecies, detecting antiviral drug resistance mutations, discovering novel virus genotypes, and performing quality control of live vaccines. While NGS reduces costs and improves throughput over Sanger sequencing, analyzing large NGS datasets requires strong bioinformatics skills. Overall, NGS represents a significant improvement for virus research and diagnosis.
Genome Wide SNPs for Admixture Analysis and Selection Signaturesfirdous ahmad
The presentation presents basic concepts and describes various advanced uses of Genome Wide SNP markers for Admixture Analysis and Selection Signatures. The usage of SNP markers has made tremendous progress in recent times and it shall help in application of genomic selection in developing countries including India
Dr. Igor Paploski - Making Epidemiological Sense Out of Large Datasets of PRR...John Blue
Making Epidemiological Sense Out of Large Datasets of PRRS Sequences - Dr. Igor Paploski, from the 2018 Allen D. Leman Swine Conference, September 15-18, 2018, St. Paul, Minnesota, USA.
More presentations at http://www.swinecast.com/2018-leman-swine-conference-material
This document summarizes the work of the Genome in a Bottle Consortium to develop reference materials for evaluating the accuracy of whole genome sequencing. It finds that different sequencing platforms and bioinformatics programs often disagree on called variants. The Consortium is developing well-characterized human genomes that will serve as standards, including the genome of individual NA12878. The goal is to assign confidence levels to called variants and provide integrated variant calls across multiple datasets to help benchmark sequencing accuracy and evaluate sequencing performance in different genomic and clinical contexts.
A SNP array for human population genetics studiesAffymetrix
Yontao Lu, Teri Genschoreck, Swapan Mallick, Amy Ollmann, Nick Patterson, Yiping Zhan, Teresa Webster, David Reich Overview of the Human Origins Array, the first array developed with leading geneticists to enable rigorous population genetics studies.
1. The study performed whole-exome sequencing on 72 cases of familial Meniere's disease to identify rare variants in genes associated with sensorineural hearing loss (SNHL).
2. Analysis identified that 40% of cases carried at least one novel or ultra-rare variant in SNHL genes, suggesting these genes play a role in Meniere's disease.
3. The OTOF gene showed a significant accumulation of rare variants compared to controls. A common intronic variant in OTOF, rs34796712, which regulates its expression, occurred less frequently in cases also carrying a rare variant, suggesting it may play a protective role.
Forensics: Human Identity Testing in the Applied Genetics GroupNathan Olson
"Forensics: Human Identity Testing in the Applied Genetics Group" presentation at the Standards for Pathogen Identification via NGS (SPIN) workshop hosted by the National Institute for Standards and Technology October 2014 by Peter Vallone, PhD from NIST.
This document discusses neural progenitor cells (NPCs), which are potent models for studying normal and diseased neurobiology. The author describes how ATCC has developed NPC lines from various sources including induced pluripotent stem cells. NPCs can be expanded and differentiated into neurons, astrocytes and oligodendrocytes. ATCC has optimized culture conditions for NPC growth and differentiation. Reporter NPC lines expressing GFP, NanoLuc, or HaloTag have also been generated. Initial NPC offerings from ATCC are outlined. In summary, ATCC provides a complete solution of NPCs and culture media for disease modeling and drug screening applications.
This document describes an analysis pipeline for gene expression data. It involves steps like raw data normalization, statistics, differential gene expression analysis, and profile interpretation. It also discusses using the Affymetrix analysis system and mouse 430A chip for a nephridium experiment comparing different time points and genotypes. Multivariate analysis techniques like 2-way ANOVA and PCA are applied. The goal is to build systems biology analysis platforms integrating clinical records, biological databases, and dynamic databases for knowledge discovery.
This document summarizes a presentation given at the Arabidopsis Information Resource (TAIR) workshop. It provides an overview of TAIR, including its role as a curated resource for Arabidopsis thaliana research. Key points include that TAIR integrates experimental data from over 33,000 A. thaliana genes, provides manual literature curation of gene functions, and offers various search and analysis tools for researchers. The presentation also demonstrates how TAIR can be leveraged to study gene functions in other plant species through sequence similarity searches and comparative genomics analyses.
- The document describes testing various bioinformatics tools wrapped in the Metapy analysis software for metabarcoding of Phytophthora species, including Bowtie, Swarm, CD-HIT, DADA2, BLASTCLUST, and VSEARCH.
- Control mixes with known Phytophthora species were analyzed to determine sensitivity, precision, false discovery and negative rates. No single tool performed best across all metrics. Bowtie had the highest accuracy due to reliance on a reference database, while Swarm had a low false positive rate.
- Error rates were examined using synthetic DNA controls which showed most variation was within 1-2 mismatches of the original sequences, suggesting thresholds less than 3 mismatches are needed.
Massively Parallel Sequencing - integrating the Ion PGM™ sequencer into your ...Thermo Fisher Scientific
This document summarizes the integration of massively parallel sequencing (MPS) using the Ion PGMTM sequencer into a forensic laboratory. The project aims to begin transforming STR profiling to genomic technologies, add additional SNP markers in a single workflow, and enable non-human DNA testing. Initial results show sequencing of amplified STR products is possible but alignment is challenging. A custom panel of 280 targets including STRs, SNPs, and amelogenin was also tested with most targets detected across samples. Ongoing work focuses on improving sensitivity, reproducibility, and analyzing mixed samples. Implementation of MPS as a routine forensic service is estimated within 3-5 years.
This document discusses applications of forensic genetics using Y-STR DNA analysis. It begins by describing characteristics of the human Y chromosome, including its size, gene content, and lack of recombination. It then discusses Y-STR polymorphisms and markers used in forensic kits. Examples are given of cases where Y-STR analysis was used, such as analyzing mixed male-female samples or determining paternity. Guidelines for interpretation and population studies using the PowerPlex Y kit are discussed. Common issues like sensitivity, mixtures, and anomalies are also summarized.
This document discusses applications of forensic genetics using Y-STR DNA analysis. It begins by describing characteristics of the human Y chromosome, including its size, gene content, and lack of recombination. It then discusses Y-STR polymorphisms and markers used in forensic kits. Examples are given of cases where Y-STR analysis was used, such as analyzing mixed male-female samples or determining paternity. Guidelines for interpretation and population studies using the PowerPlex Y kit are discussed. Common issues like sensitivity, mixtures, and anomalies are also summarized.
Similar to The Next Dimension in STR Sequencing: Polymorphisms in Flanking Regions and Their Allelic Associations (13)
Nucleophilic Addition of carbonyl compounds.pptxSSR02
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 104 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10−8 photons cm−2
s
−1
. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
hematic appreciation test is a psychological assessment tool used to measure an individual's appreciation and understanding of specific themes or topics. This test helps to evaluate an individual's ability to connect different ideas and concepts within a given theme, as well as their overall comprehension and interpretation skills. The results of the test can provide valuable insights into an individual's cognitive abilities, creativity, and critical thinking skills
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
I will finally argue that deep variability is both the problem and solution of frictionless reproducibility, calling the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Exposé invité Journées Nationales du GDR GPL 2024
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxMAGOTI ERNEST
Although Artemia has been known to man for centuries, its use as a food for the culture of larval organisms apparently began only in the 1930s, when several investigators found that it made an excellent food for newly hatched fish larvae (Litvinenko et al., 2023). As aquaculture developed in the 1960s and ‘70s, the use of Artemia also became more widespread, due both to its convenience and to its nutritional value for larval organisms (Arenas-Pardo et al., 2024). The fact that Artemia dormant cysts can be stored for long periods in cans, and then used as an off-the-shelf food requiring only 24 h of incubation makes them the most convenient, least labor-intensive, live food available for aquaculture (Sorgeloos & Roubach, 2021). The nutritional value of Artemia, especially for marine organisms, is not constant, but varies both geographically and temporally. During the last decade, however, both the causes of Artemia nutritional variability and methods to improve poorquality Artemia have been identified (Loufi et al., 2024).
Brine shrimp (Artemia spp.) are used in marine aquaculture worldwide. Annually, more than 2,000 metric tons of dry cysts are used for cultivation of fish, crustacean, and shellfish larva. Brine shrimp are important to aquaculture because newly hatched brine shrimp nauplii (larvae) provide a food source for many fish fry (Mozanzadeh et al., 2021). Culture and harvesting of brine shrimp eggs represents another aspect of the aquaculture industry. Nauplii and metanauplii of Artemia, commonly known as brine shrimp, play a crucial role in aquaculture due to their nutritional value and suitability as live feed for many aquatic species, particularly in larval stages (Sorgeloos & Roubach, 2021).
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...AbdullaAlAsif1
The pygmy halfbeak Dermogenys colletei, is known for its viviparous nature, this presents an intriguing case of relatively low fecundity, raising questions about potential compensatory reproductive strategies employed by this species. Our study delves into the examination of fecundity and the Gonadosomatic Index (GSI) in the Pygmy Halfbeak, D. colletei (Meisner, 2001), an intriguing viviparous fish indigenous to Sarawak, Borneo. We hypothesize that the Pygmy halfbeak, D. colletei, may exhibit unique reproductive adaptations to offset its low fecundity, thus enhancing its survival and fitness. To address this, we conducted a comprehensive study utilizing 28 mature female specimens of D. colletei, carefully measuring fecundity and GSI to shed light on the reproductive adaptations of this species. Our findings reveal that D. colletei indeed exhibits low fecundity, with a mean of 16.76 ± 2.01, and a mean GSI of 12.83 ± 1.27, providing crucial insights into the reproductive mechanisms at play in this species. These results underscore the existence of unique reproductive strategies in D. colletei, enabling its adaptation and persistence in Borneo's diverse aquatic ecosystems, and call for further ecological research to elucidate these mechanisms. This study lends to a better understanding of viviparous fish in Borneo and contributes to the broader field of aquatic ecology, enhancing our knowledge of species adaptations to unique ecological challenges.
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
The Next Dimension in STR Sequencing: Polymorphisms in Flanking Regions and Their Allelic Associations
1. THE NEXT DIMENSION IN STR SEQUENCING:
Polymorphisms in Flanking Regions
& their Allelic Associations
Katherine Butler Gettings
Rachel Aponte
Kevin Kiesler
Peter Vallone
September 2, 2015
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
2. Disclaimer
I will mention commercial STR kit names and information, but I am not
attempting to endorse any specific products.
NIST Disclaimer: Certain commercial equipment, instruments and materials
are identified in order to specify experimental procedures as completely as
possible. In no case does such identification imply a recommendation or it
imply that any of the materials, instruments or equipment identified are
necessarily the best available for the purpose.
Information presented does not necessarily represent the official position of
the National Institute of Standards and Technology or the U.S. Department of
Justice.
Our group receives or has received funding from the FBI Laboratory and the
National Institute of Justice.
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
3. NIST Population
Samples
• N=183
•Caucasian - 70
•Hispanic - 45
•African American - 68
Amplification
& Library Prep
•PowerSeq Auto
System (Beta)
•PPFusion Loci
•Illumina TruSeq
HS PCR-Free
Sequencing
•MiSeq
Bioinformatics
•STRait Razor
•ExactID (Battelle)
•CE concordance
Analysis
•Repeat Region
•Flanking Region
•Stutter
Data GenerationNIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
4. N=183
Length
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
5. Forensic STR Sequence Diversity
N=183
Simple Repeats
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
6. 150 bp
flanks
110 bp
flanks
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
7. Categories of Flanking Regions
1. No polymorphisms
2. Polymorphisms Associated with a Sequence Variant
3. Rare polymorphisms
4. Population or Allele Specific polymorphisms
5. “Old” polymorphisms (not population or allele specific)
6. Multiple polymorphisms in Haplotype
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
9. No SNPs were identified
in the sequenced flanking regions of:
D3S1358 Penta E
FGA D19S433
D12S391 D21S11
Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
D3S1358
FGA
D12S391
Penta E
D19S433
D21S11
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
No SNPs
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
10. Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
D1S1656 rs4847015 0.191 0.336 0.311 5 5 4 0
rs11063971 0.044 0.086 0.078 1 1 1
rs11063970 0.044 0.086 0.078 1 1 1
rs11063969 0.044 0.086 0.078 1 1 1
rs75219269 0.044 0.086 0.078 1 1 1
rs199970098 0.029 - 0.011 2 - 1
0vWA
SNPs Associated with
Repeat Region
Sequence Variant
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
D1S1656 has one SNP, rs4847015
• Well distributed across populations…
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
11. Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
D1S1656 rs4847015 0.191 0.336 0.311 5 5 4 0
rs11063971 0.044 0.086 0.078 1 1 1
rs11063970 0.044 0.086 0.078 1 1 1
rs11063969 0.044 0.086 0.078 1 1 1
rs75219269 0.044 0.086 0.078 1 1 1
rs199970098 0.029 - 0.011 2 - 1
0vWA
SNPs Associated with
Repeat Region
Sequence Variant
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
D1S1656 has one SNP, rs4847015
• Well distributed across populations
• Well distributed across STR alleles
• Does NOT increase the number of alleles…
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
12. D1S1656 has one SNP, rs4847015
and it is always associated with x.3 alleles:
15.3
16.3
17.3 rs4847015 = T
18.3
19.3
Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
D1S1656 rs4847015 0.191 0.336 0.311 5 5 4 0
rs11063971 0.044 0.086 0.078 1 1 1
rs11063970 0.044 0.086 0.078 1 1 1
rs11063969 0.044 0.086 0.078 1 1 1
rs75219269 0.044 0.086 0.078 1 1 1
rs199970098 0.029 - 0.011 2 - 1
0vWA
SNPs Associated with
Repeat Region
Sequence Variant
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
13. Rare SNPs were identified
in the sequenced flanking regions of:
D2S441 D18S51
CSF1PO Penta D
D8S1179 D22S1045
D10S1248
Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
D2S441 rs74640515 0.015 0.014 0.067 1 1 2 2
CSF1PO C-T 36bp 5' - - 0.011 - - 1 1
D8S1179 rs138862078 0.007 - - 1 - - 1
D10S1248 T-G 2bp 3' - 0.007 - - 1 - 1
D18S51 rs535833682 0.029 - - 3 - - 1
Penta D rs7279663 0.022 0.014 0.011 1 2 1 2
D22S1045 rs190864081 0.007 - - 1 - - 1
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
Rare SNPs
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
14. Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
rs13422969 0.135 - 0.011 2 - 1
rs115644759 0.022 - - 2 - -
rs149212737 - 0.007 - - 1 -
TH01 rs79373318 0.148 - - 3 - - 3
TPOX
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
Population / Allele
Specific SNPs
4
TPOX: rs13422969 and two additional rare SNPs
• Well distributed across Africa, rare elsewhere
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
15. Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
rs13422969 0.135 - 0.011 2 - 1
rs115644759 0.022 - - 2 - -
rs149212737 - 0.007 - - 1 -
TH01 rs79373318 0.148 - - 3 - - 3
TPOX
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
Population / Allele
Specific SNPs
4
TPOX: rs13422969 and two additional rare SNPs
• Well distributed across Africa, rare elsewhere
• Associated with “9” allele
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
16. D16S539: rs1728369 and one additional rare SNP
• Well distributed across populations and alleles
Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
rs1728369 0.404 0.171 0.278 4 4 4
G-A 94bp 5' - - 0.011 - - 1
D2S1338 rs6736691 0.221 0.264 0.300 8 6 4 3
5D16S539
Single "Old" SNP
(not population or
allele specific)
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
17. Category Locus rs Number
African
American
Caucasian Hispanic
African
American
Caucasian Hispanic
G-T 4bp 3' 0.316 0.257 0.189 7 5 6
rs25768 0.228 0.257 0.156 5 5 5
rs146841551 0.029 - - 1 - -
rs541272009 0.007 - - 1 - -
rs9546005 0.309 0.286 0.333 5 4 7
rs73525369 0.066 - - 3 - -
rs73250432 - 0.014 0.011 - 2 1
rs146621667 0.015 - - 2 - -
4bp del 8bp 3' - - 0.022 - - 2
4bp del 21bp 3' 0.007 - 0.022 1 - 1
rs16887642 0.177 0.050 0.022 4 2 2
rs7789995 0.015 0.164 0.122 2 4 4
rs7786079 0.176 0.021 0.011 5 2 1
1bp del 21bp 3' - - 0.011 - - 1
D7S820 15
Multiple SNPs in
Haplotype
16D13S317
D5S818 17
Minor Allele Frequency Number of Associated STR Alleles Alleles Gained
with Flanking
SNPs
D5S818:
• G→T 4bp 3’
• rs25768
• two additional rare SNPs
Well distributed across
populations and alleles
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
18. NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
20. Conclusions
• Ideal SNPs for increasing allelic diversity are neither population nor
allele specific
• Multiple SNPs in haplotype may substantially increase allelic diversity
• Population studies are needed to associate flanking region
polymorphisms with STR alleles
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
21. Flanking Region Polymorphisms in Practice
Problem is ambiguity
Full string is unambiguous,
intractable for reporting
Report differences from a reference genome
• Delineate length of amplicons
• Agree upon a reference genome
“My Wife and My Mother-in-Law”
William Ely Hill, 1915
Sample …CCAGTTACGACGCTAACTACTCTGAG…
Reference…CCAGTTACGACGCTAACTACTTTGAG…
Lab 1
Sample …CCAGTTACGACGCTAACT
Reference…CCAGTTACGACGCTAACTACTTTGAG…
Lab 3
Sample …CCAGTTACGACGCTAACTACTCTGAG…
Reference…CCAGTTACGACGCTAACTACTCTGAG…
Lab 2
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
22. Flanking Region Polymorphisms in Practice
Problem is ambiguity
Full string is unambiguous,
intractable for reporting
Report differences from a reference genome
• Delineate length of amplicons
• Agree upon a reference genome
• Report rs number (not always available)
• Report position and base differences,
e.g. Chr6:1004589:T
– No allowance for new genome versions
• InDel alignment parameters
“My Wife and My Mother-in-Law”
William Ely Hill, 1915
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
23. Looking Forward
Central repository for population sequence data at forensic STR loci
Labs generating worldwide population data will upload full strings
Database converts strings to tractable notation
• Bracketed repeat region
• Flank polymorphisms with length delineated
Database is searchable by repeat motif or string
Gatekeeper for
unusual sequences:
consistency and
back-compatible
allele calling
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
24. Acknowledgements
NIST
Pete Vallone
Kevin Kiesler
Nate Olson
Jo Lynne Harenza
Mike Coble
Becky Hill
Margaret Kline
Student Interns
Rachel Aponte - George Washington University
Harish Swaminathan - Rutgers University
Anna Blendermann - Mongomery College
Promega
Doug Storts
Jay Patel
Contact Information
katherine.gettings@nist.gov
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group
NIST Applied Genetics Group