GENOME TO PANGENOME:
A doorway into crop genome exploration
KIRAN K.M
PGS20AGR8449
Department of Genetics and Plant Breeding
MASTERS SEMINAR 1
UNIVERSITY OF AGRICULTURAL SCIENCES, DHARWAD
Our journey through

Introduction: How does genome-assisted crop improvement work, and what sort of information is missing from this approach?
How important is this “missing information”?
How do we capture this information? Birth of the pangenome
How do we represent and analyze a pangenome effectively to dig out new kinds of information?
Is the information mined from pangenome-oriented GWAS worthwhile?
What are the applications and future perspectives of pangenome-oriented crop improvement?
Is the pangenome the end of the story? Conclusion
Entire genic/allelic variant forms within a
species
Domestication
Ecotype differentiation
Selection pressure
Birth or death of some
genes/modification via deletion,
duplication, transposition etc.
SINGLE-REFERENCE GENOME
Single reference genome oriented Comparative genome analysis
What if our reference genome is too incomplete to capture all of this information?
We need to capture the entire genetic diversity of a species: the doorway into single-reference-free pan-genome analysis
(Aggarwal, 2022)
“Boosting-up” of crop improvement programs
Genomic data
derived from multiple
accessions and cultivars
Full extent of
sequence variations
within a species
PAN-GENOMIC
approach to figure out new genes and alleles
directly related to phenotype
“A pangenome refers to the full complement of genes of a biological clade, such as a species, which can be partitioned into a set of core genes that are shared by all individuals and a set of dispensable genes that are partially shared or individual specific.”
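The core/dispensable partition in this definition can be illustrated with a minimal Python sketch. The accession names and gene IDs below are hypothetical toy data, and the function name `partition_pangenome` is introduced here only for illustration.

```python
from typing import Dict, Set, Tuple

def partition_pangenome(pav: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Split genes into core (present in every accession) and dispensable (the rest)."""
    gene_sets = list(pav.values())
    pan = set().union(*gene_sets)          # every gene seen in any accession
    core = set.intersection(*gene_sets)    # genes shared by all accessions
    return core, pan - core

# Hypothetical toy data: accession name -> genes detected in that accession.
toy_pav = {
    "acc1": {"g1", "g2", "g3"},
    "acc2": {"g1", "g2", "g4"},
    "acc3": {"g1", "g2"},
}
core, dispensable = partition_pangenome(toy_pav)
print(sorted(core))         # ['g1', 'g2']
print(sorted(dispensable))  # ['g3', 'g4']
```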
Hervé Tettelin Duccio Medini
✔ Pangenomes were first introduced by Tettelin et al. (2005) to describe gene diversity in Streptococcus agalactiae.
Michele Morgante
✔ Pangenomics in plants was first proposed by Morgante et al. (2007).
✔ 2014: first crop pangenome, in soybean, built from wild soybean (Glycine soja) accessions (Li et al., 2014)
Pan-genomics : Evolution
Extensive structural variants (SVs)
Presence-absence variation
(PAV)
Copy number variation (CNV) Chromosomal rearrangements
Origin of SVs
Recombination errors,
Non-allelic homologous
recombination (NAHR)
Replication errors
Microhomology-mediated break-
induced replication (MMBIR)
DNA break repair errors
eg: Non-homologous end
joining (NHEJ)
Non-reciprocal exchanges.
Homoeologous non-reciprocal
transpositions (HNRT)
Replication errors
Fork stalling and template
switching (FoSTeS)
PAV
Gabur et al., 2018
CNV
Polyploidization and/or
Whole-genome duplication
Structural variations (SVs): the untapped genetic potential
SVs affecting complex agronomic traits
❑ Resistance to biotic stress
➢ Rhg1 locus: resistance to cyst nematode (soybean)
➢ Absence of a sulfotransferase gene, in PAVs of various sizes: resistance to Striga (sorghum)
➢ Deletions in the Pi21 gene result in quantitative and durable resistance against blast disease (rice)
❑ Resistance to abiotic stress
➢ PAV of the ERF-like gene Sub1A (Xu et al. 2006): submergence tolerance; two ERF genes, SNORKEL1 and SNORKEL2 (Hattori et al. 2009): deepwater response (rice)
➢ Tolerance of phosphorus starvation at the Pup1 locus is attributed to the presence of a receptor-like cytoplasmic kinase gene, PSTOL1 (rice)
❑ Plant architecture
➢ An extra copy of Rht-D1b, resulting from duplication of a >1 Mb region, causes a >70% reduction in plant height (wheat)
❑ Yield and grain quality
➢ A 1,212-bp deletion 5 kb downstream of the GW5 gene causes variation in grain width and grain weight (rice)
❑ Flowering time
➢ A CNV at the HvFT1 locus is associated with flowering time variation (barley)
Compared with the entire pan-genome, genes in the flexible genome were
significantly enriched with those involved in biological processes, such as defense
response, photosynthesis and biosynthetic processes
Dynamics of pan-genome compartments
➢ Gene birth and death processes
• Errors during recombination
• De novo genes
• Duplication followed by rapid divergence, neo-functionalization
➢ Transposable elements
• In maize, helitron TE activity can modify 50% of the genome structure.
➢ Horizontal transfers
• Conjugation
• Transduction
• Transformation
(Christine et al., 2019)
Pan-genome construction and assembly methods
De novo sequencing and comparison
Iterative mapping and assembly
Graph-based approaches
1. Sequence and variation graphs (VGs)
2. Practical haplotype graphs (PHGs)
▪ Errors in assembly and annotation may lead to false calling of variation
▪ Costly; requires high-quality data with high sequencing coverage
▪ Limits the application to relatively few individuals
Start with a single reference genome as the base for the pangenome.
Whole-genome sequence data for multiple individuals are aligned to the reference genome.
Non-aligning sequence reads are assembled and added to the reference to build the pangenome.
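The iterative mapping-and-assembly loop described above can be sketched in Python as a toy model. This is an illustration under strong simplifying assumptions: "mapping" is reduced to an exact substring check and "assembly" to keeping each distinct unmapped read as a contig, whereas real pipelines use a read aligner and a de novo assembler. The function name, the `min_len` parameter and all sequences are hypothetical.

```python
from typing import Dict, List

def iterative_pangenome(reference: str,
                        accessions: Dict[str, List[str]],
                        min_len: int = 5) -> List[str]:
    """Grow a pangenome by adding sequence that does not map to what is already there."""
    pangenome = [reference]
    for reads in accessions.values():
        # Toy "mapping": a read maps if it occurs exactly in any pangenome sequence.
        unmapped = [r for r in reads if not any(r in seq for seq in pangenome)]
        # Toy "assembly": each distinct unmapped read of sufficient length is a contig.
        contigs = sorted({r for r in unmapped if len(r) >= min_len})
        pangenome.extend(contigs)
    return pangenome

# Hypothetical reads for two accessions.
toy_reads = {
    "acc1": ["ACGTA", "TTTTTCC"],
    "acc2": ["TTTTTCC", "GGGGGAA", "CGT"],
}
result = iterative_pangenome("ACGTACGTGG", toy_reads)
print(result)  # ['ACGTACGTGG', 'TTTTTCC', 'GGGGGAA']
```

The second accession adds only `GGGGGAA`, because `TTTTTCC` already maps to the contig contributed by the first accession; this is why each iteration tends to add less new sequence than the previous one.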
MULTIPLE GENOMES
Reference guided assembly
De-novo genome assembly
(Danilevicz, 2020)
(Eizenga et al., 2020)
Pan-genome graphs
• Represent the sequence content and the corresponding functional annotation of an entire population, species, or clade
• Compress redundant sequences into smaller data structures while retaining information on genomic diversity and whole-genome relationships
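How a graph compresses redundant sequence while keeping each genome recoverable can be shown with a very small Python sketch. The node layout, the haplotype names and the sequences are hypothetical; real variation graph tools use far richer data structures.

```python
# Nodes hold sequence segments; each haplotype is a path (list of node IDs).
# Shared segments (nodes 1 and 4) are stored once; the variant site is a
# "bubble" with two alternative nodes (2 and 3).
nodes = {1: "ACGT", 2: "A", 3: "G", 4: "TTGA"}
paths = {"hapA": [1, 2, 4], "hapB": [1, 3, 4]}

def spell(path):
    """Reconstruct the linear sequence of one haplotype from its path."""
    return "".join(nodes[node_id] for node_id in path)

linear_bases = sum(len(spell(p)) for p in paths.values())  # bases if stored separately
graph_bases = sum(len(seq) for seq in nodes.values())      # bases stored in the graph

print(spell(paths["hapA"]))       # ACGTATTGA
print(spell(paths["hapB"]))       # ACGTGTTGA
print(linear_bases, graph_bases)  # 18 10
```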
Ideally, a complete and fully annotated
pangenome graph would integrate genomic,
epigenomic, and transcriptomic datasets, thus
facilitating downstream functional and
comparative analyses
Genome region harboring CsFT locus among six cucumber accessions
44.0 kb complex
insertion
25.3 kb complex
insertion
39.3 kb canonical
insertion
Li et al., 2022, Nature Communications
Variation Graph (VG)
Practical Haplotype Graph (PHG) database and haplotype creation / trellis graph representing genic and intergenic regions.
Jensen et al., 2020, The Plant Genome
The aim of a PHG is to determine which haplotypes, from parents sequenced at high coverage, are present in progeny sequenced at low coverage.
❖ Extrapolation of pangenome size leads to:
• a predicted pangenome of 63,865±31 genes (37,766±62 gene families)
• a predicted core genome of 49,740±164 genes (28,496±91 gene families)
Model describing the sizes of the core genome and pangenome.
All genes: 61,379 genes in the pangenome, 49,895 genes in the core
Orthologous gene clusters: 35,853 gene families in the pangenome, 28,532 gene families in the core
Golicz et al., 2016, Nature Communications, DOI: 10.1038/ncomms13390
How many genomes are needed to capture the whole genome content?
✔ A core/pangenome ratio below 85% indicates extensive adaptation.
✔ In plants, the core genome represents 40-80% of the total pangenome.
Munir et al., 2020
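As a worked example of the core/pangenome ratio, the Brassica oleracea counts shown earlier can be plugged in. This is a sketch: pairing each number with "core" or "pangenome" is an assumption read from the flattened figure, and the helper name `core_share` is hypothetical.

```python
# Counts as shown on the B. oleracea slide; which value is "core" and which is
# "pangenome" is an assumption based on the flattened figure text.
pan_genes, core_genes = 61_379, 49_895
pan_families, core_families = 35_853, 28_532

def core_share(core: int, pan: int) -> float:
    """Core/pangenome ratio as a percentage, rounded to one decimal place."""
    return round(100 * core / pan, 1)

gene_share = core_share(core_genes, pan_genes)
family_share = core_share(core_families, pan_families)
print(gene_share)    # 81.3
print(family_share)  # 79.6
```

Both values fall below the 85% threshold quoted above.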
Power-law regression for new genes
The numbers n of new genes are plotted
for increasing values of the number N of
genomes sequenced.
(Tettelin et al. 2008)
✔Blue curves are least-squares fit of the
exponential function, as in the original
pangenome model.
✔Red curves are least-squares fit of the
power-law function.
Open pan genome
Closed pan genome
Closed (α > 1)
Open (α < 1)
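The power-law model for new genes, n = κ·N^(−α), and the open/closed classification can be sketched as follows. Two assumptions are made here: the fit is a plain least-squares line on log-log axes (a simplification, not necessarily the fitting procedure used in the cited work), and α = 1 is grouped with "open". The data are synthetic, generated from a known α purely to check the fit.

```python
import math

def fit_power_law(n_genomes, new_genes):
    """Fit n = kappa * N**(-alpha) by ordinary least squares on log-log axes."""
    xs = [math.log(N) for N in n_genomes]
    ys = [math.log(n) for n in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (kappa, alpha)

def classify(alpha):
    """Closed if alpha > 1, otherwise open (alpha == 1 treated as open: an assumption)."""
    return "closed" if alpha > 1 else "open"

# Synthetic data generated with kappa = 1000 and alpha = 1.5.
N_values = list(range(2, 11))
n_values = [1000 * N ** -1.5 for N in N_values]
kappa, alpha = fit_power_law(N_values, n_values)
print(round(kappa), round(alpha, 2), classify(alpha))  # 1000 1.5 closed
```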
Tools – pangenome analysis
Software / Tool | Description / Role | URL
PanSeq | Extracts regions unique to a genome, identifies SNPs and constructs the input file for phylogeny programs | https://lfz.corefacility.ca/panseq/
PanFunPro | Homology detection and pairwise genome analysis in the pan/core genome | https://zenodo.org/record/7583#.YTR36p0zY2w
PGAP | Detection of homologous and orthologous genes, SNPs, phylogenetic studies, pangenome plotting and functional annotation | http://pgap.sf.net
PanACEA | Identification of genomic regions that are phylogenetically dissimilar | https://github.com/JCVenterInstitute/PanACEA
PGAP-X | Analyzes genome diversity and visualizes genome structure and gene content to understand evolution | http://pgapx.ybzhao.com/
PAN2HGENE | Identifies new products, altering the behavior of the α value in the pangenome without altering the original genomic sequence | https://sourceforge.net/projects/pan2hgene-software
BGDMdocker | Pangenome analysis, visualization, clustering and genome annotation | https://www.docker.com/whatisdocker
Aggarwal et al., 2022
MATERIALS AND METHODS
Pan-Genome Assembly and Annotation
Gene Presence-Absence Variations(gPAVs)
SNP Discovery and Annotation
Sorghum Diversity and Population Structure
Genome-Wide Association Analysis (GWAS) : Two different mapping
populations having the phenotypic data of 10 traits
Drought RNA-Seq Assay Analysis
Pan-Genome Assembly and Annotation
Iterative mapping and assembly approach
1. Start with the sorghum reference assembly v3.0.1 and add whole-genome sequence data iteratively.
2. Map reads from 176 sorghum accessions (minimum 10X coverage) to the reference with Bowtie2 v2.3.4.
3. Assemble the unmapped reads with the IDBA_UD assembler; only assembled contigs longer than 500 bp were considered and appended to the reference genome sequence.
4. Compare the new sequences against the NCBI non-redundant nucleotide database with BLASTn, and remove sequences with homology to sorghum mitochondrial or chloroplast sequences, or whose hits fall outside Viridiplantae (the green plant group).
5. Mask repetitive elements (>90 percent coverage with greater than 90 percent identity) with RepeatMasker v4.0.7, using sorghum as the species.
6. Predict genes with AUGUSTUS v3.3.2 using homology and ab initio methods; sorghum expressed sequence tags (ESTs) from GenBank were aligned with tBLASTx.
On average, each iteration of the process added 1.9 Mb.
263.7 Mbp
89.2 Mb removed
RNA-Seq mapping hints from the 25 accessions were used for evidence-combining ab initio gene prediction, and the 3,589 genes supported by mapped expressed sequence tag (EST) sequences were retained.
Identified 11,057 to 17,616 variable genes in the 176 genomes
✔ Average gene sequence length: 1,567 bp
✔ Average exons per gene: 3.6
Sorghum Pan-Genome Gene PAV (gPAV)
Whole-genome sequence reads of all 354 sorghum accessions were mapped with Bowtie2 v2.3.4.
Gene models on contigs longer than 1 kbp were used in this analysis.
A neighbour-joining (NJ) tree was constructed with the R “ape” package.
Cluster analysis: all-by-all BLASTp followed by MCL.
Gene enrichment analysis: R “topGO” package, using the “elim” method.
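A neighbour-joining tree needs a pairwise distance matrix as input. The sketch below shows one simple way such a matrix could be derived from a binary gene PAV table. The distance used (fraction of genes whose presence/absence state differs) is an assumption for illustration; the slides do not state which distance was used, and the accession names and values are hypothetical.

```python
from itertools import combinations

def pav_distance(a, b):
    """Fraction of genes whose presence/absence state differs between two accessions."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Hypothetical PAV table: 1 = gene present, 0 = gene absent (same gene order per row).
pav_table = {
    "acc1": [1, 1, 0, 1],
    "acc2": [1, 0, 0, 1],
    "acc3": [0, 0, 1, 1],
}
distances = {
    (p, q): pav_distance(pav_table[p], pav_table[q])
    for p, q in combinations(pav_table, 2)
}
print(distances)
# {('acc1', 'acc2'): 0.25, ('acc1', 'acc3'): 0.75, ('acc2', 'acc3'): 0.5}
```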
SNP Discovery and Annotation
1. Whole-genome sequence reads of 354 accessions were quality-trimmed using Trimmomatic.
2. Paired reads were mapped to the pangenome with Bowtie2 v2.3.4.
3. Duplicate reads were filtered out with Picard tools.
4. Variants against the reference (pan-genome) were called with GATK v.4.1.
5. SNPs were functionally annotated with SnpEff v.4.3.
▪ Closed-type pangenome: 35,719 genes
• Core genes: 16,821 (47%)
• Variable genes: 18,898
▪ 30 genes uniquely present
▪ 3,183 genes (8.9%) uniquely absent
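The gene counts above are internally consistent, which a few lines of Python arithmetic can confirm. Only numbers shown on the slide are used; the helper name `pct` is hypothetical.

```python
total_genes = 35_719
variable_genes = 18_898
uniquely_absent = 3_183

core_genes = total_genes - variable_genes  # genes present in every accession

def pct(part: int) -> float:
    """Share of the total gene count, in percent, rounded to one decimal place."""
    return round(100 * part / total_genes, 1)

print(core_genes)            # 16821
print(pct(core_genes))       # 47.1
print(pct(variable_genes))   # 52.9
print(pct(uniquely_absent))  # 8.9
```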
RESULTS
▪ Total: 2 million SNPs; 91,319 SNPs in the extra-contig assembly
▪ Variable genes are shorter, with fewer exons
▪ Variable genes have fewer synonymous SNPs and similar numbers of non-synonymous SNPs compared to core genes
A - Reference whole genome sequence reads mapping
B - Drought expression (RNASeq) sequence mapping
density
C - Gene density
D - Genes commonly present in all accessions (core genes)
E - genes absent in at least one of the accessions (variable
genes)
F - SNP density
G - Insertions and deletions (indels).
SNP density
• Extra contigs: 0.52/kbp
• Rest: 2.71/kbp
• 210,805 contigs
• Minimum contig length: 500 bp
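The extra-contig SNP density can be cross-checked against figures given earlier in the deck. The sketch assumes that the retained extra-contig sequence equals the 263.7 Mbp shown in the assembly step minus the 89.2 Mb removed; that reading is an inference from the slides, not something they state explicitly.

```python
assembled_mbp = 263.7       # figure shown in the assembly step (assumed: newly assembled sequence)
removed_mbp = 89.2          # sequence removed during filtering
extra_contig_snps = 91_319  # SNPs reported in the extra-contig assembly

retained_mbp = assembled_mbp - removed_mbp
density_per_kbp = extra_contig_snps / (retained_mbp * 1000)  # 1 Mbp = 1000 kbp

print(round(retained_mbp, 1))     # 174.5
print(round(density_per_kbp, 2))  # 0.52
```

Under that assumption the result matches the 0.52 SNPs per kbp reported for the extra contigs.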
gPAV-based neighbour-joining tree with histogram bars
▪ Among the 35,719 total genes, 53% showed presence-absence variation and were used to estimate the relationships among the accessions
✔ The largest numbers of genes
▪ uniquely present: Macia (9 genes)
▪ uniquely absent: PI660645 (372 genes)
This indicates their evolutionary distance from the other accessions
The Ka/Ks ratio estimating the balance between
neutral mutations, purifying selection, and beneficial
mutations on a set of core and variable genes
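The usual reading of a Ka/Ks ratio can be written as a small helper. This is a sketch: the tolerance band around 1 that is treated as "neutral" is an arbitrary illustrative choice, and the function name and example values are hypothetical.

```python
def selection_regime(ka: float, ks: float, tol: float = 0.1) -> str:
    """Interpret Ka/Ks: below 1 suggests purifying selection, near 1 neutrality,
    above 1 positive selection. `tol` is an illustrative tolerance, not a standard."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying"
    if ratio > 1 + tol:
        return "positive"
    return "neutral"

print(selection_regime(0.02, 0.10))  # purifying
print(selection_regime(0.10, 0.10))  # neutral
print(selection_regime(0.30, 0.10))  # positive
```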
Distribution of Infinium SNP array markers on chromosome
Principal co-ordinate analysis
✔ Three different clusters, one of them containing two groups (Caudatum and Kafir)
✔ Durra and guinea sorghum races formed identifiable clusters
✔ Caudatum and Kafir accessions showed admixture
▪ A total of 111 of the variable genes are race-specific
▪ Unique genes from durra are associated with
✔ heat shock protein, LRR repeat protein, L-type lectin-domain receptor, ABC transporter family proteins, and Ras-related proteins
▪ Unique genes from the guinea group are associated with
✔ disease resistance protein, beta-glucosidase proteins, NRT1/PTR protein family, etc.
GENE CLUSTER ANALYSIS identified
➢ 11,470 gene families
➢ 6,057 un-clustered genes
▪ 556 from the non-reference genes and the remaining 5,501 from reference genes
Specific and common genes across races
Genome-Wide Association Analysis (GWAS) with two different mapping populations having phenotypic data for 10 traits
POPULATION 1
▪ A subset of 227 accessions from the 354-accession WGS set, belonging to the four major races of sorghum and representing Africa, Asia and America, was used
▪ Phenotype and genotype data were associated for
➢ Plant height (PH)
➢ Dry biomass (DBM)
➢ Starch (ST)
POPULATION 2
▪ The stay-green fine-mapping population, developed from the introgression line cross RSG04008-6 × J2614-11, was used for the association study with the pan-genome assembly
➢ Green leaf area (GLA)
➢ Glossy (GL) leaf
➢ Sheath pigment (LSP)
➢ Plant vigor (V)
➢ Trichome low (TL)
➢ Trichome up (TU)
➢ Shoot fly dead hearts (SFDH)
1. Pan-genome helps identify novel genes
Significant association of SNPs for plant biomass on chromosome 9
Significant association of SNPs for plant height on extra contigs
✔ From Population 1: a total of 36 SNPs on extra contigs were associated with the target traits. Among them,
10 SNP
25 SNP
1 SNP
✔ From Population2 : Trait Green leaf area (GLA) significant association with five SNPs
on extra-contigs
Starch (ST)
Dry biomass (DBM)
Plant height (PH)
Identification of the Drought Candidate Genes
▪ Sorghum RNA-Seq data were generated from
Drought-resistant: BTx623 (DR1), SC56 (DR2)
Susceptible: Tx7000 (DS1), PI482662 (DS2)
▪ 79 of the 1,788 drought-responsive (differentially expressed) genes were located on the newly assembled sequence (extra contigs).
▪ After 6 hr of treatment, 14 genes from novel sequence were expressed in the DR1 data set (13 up- and 1 down-regulated) and 34 in the DR2 data set (31 up- and 3 down-regulated).
▪ Overall, five drought-related genes were co-mapped with the trait-associated genes. Among these, two drought-resistance-specific genes, Sobic.005G069800 and Sobic.006G127800, were linked to the plant height and sheath pigment (LSP) traits.
Venn diagram: DR1 data set and DR2 data set
Functional consequences of
new transposable element
insertions
Possible effects on gene
product structure
Transposable elements (TEs), a driver of structural variation
TE insertions were shown to be associated with changes in methylation, chromatin accessibility and potentially regulatory functions
Possible effects on gene product abundance
TEs as novel regulatory elements
TEs carrying ACRs are enriched for association with higher expression of nearby genes, which indicates their role as novel regulatory elements
(a) Insertions of transposons into genes, accessible chromatin regions (ACRs) or regulatory elements might often result in reduced expression of nearby genes or altered patterns of expression
(b) Insertions of TEs that contain ACRs might act as mobile enhancers that affect the expression of both the TE promoters and nearby gene promoters, i.e., rewiring the transcription of nearby genes
(Noshay et al., 2020)
Pangenome: a tool to unveil the hidden role of transposable elements (TEs) in crop evolution
Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus
VOL 6 | 2020
SNP-based GWAS versus PAV-based GWAS: case study for silique length (SL), seed weight (SW) and flowering time in Brassica napus.
Manhattan plots of SNP-GWAS and PAV-GWAS for silique length.
GWAS (-lmm 1: Wald test) was performed with 3,971,412 SNPs or 27,216 PAVs in the BN-NAM population containing 2,141 RILs.
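The idea of testing a PAV marker against a trait can be illustrated with a deliberately simplified Python sketch. It is not the mixed-model Wald test used in the study: a plain two-group comparison (Welch-style t statistic) is swapped in, with no correction for population structure or kinship. The function name and all data are hypothetical.

```python
import math
from statistics import mean, variance

def pav_association(presence, phenotype):
    """Compare trait values between lines carrying a PAV (1) and lines lacking it (0).
    Returns (mean difference, Welch-style t statistic). Simplified illustration only."""
    present = [p for flag, p in zip(presence, phenotype) if flag]
    absent = [p for flag, p in zip(presence, phenotype) if not flag]
    diff = mean(present) - mean(absent)
    std_err = math.sqrt(variance(present) / len(present)
                        + variance(absent) / len(absent))
    return diff, diff / std_err

# Hypothetical data: PAV state and silique length (cm) for six lines.
pav_state = [1, 1, 1, 0, 0, 0]
silique_length = [8.0, 9.0, 10.0, 5.0, 6.0, 7.0]
diff, t_stat = pav_association(pav_state, silique_length)
print(diff, round(t_stat, 2))  # 3.0 3.67
```

In a genome scan, a statistic of this kind would be computed for every PAV marker and plotted along the chromosomes as a Manhattan plot.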
• Although the peak SNP on chromosome A09 fell within the region previously identified by traditional quantitative trait locus mapping and positional cloning, none of the associated SNPs was located in the regulatory region or coding sequence of the target gene BnaA9.CYP78A9
• Encouragingly, PAV-GWAS directly detected the 3.9-kb CACTA-like TE inserted upstream of the BnaA9.CYP78A9 (P450 monooxygenase) promoter region, which was identified as the causal variation for SL and SW
Phenotype data of silique length in eight B. napus accessions.
• Experiments were repeated five times with similar results.
Phenotype data of seed weight in eight assembled B. napus accessions.
A 3.6-kb CACTA-like insertion as the lead PAV in the BnaA09.CYP78A9 promoter region.
2. Pangenome reveals the secret of niche-specific fitness
Tapidor
Quinta
Gagan
ZS11
Shengli
Zheyou7
Westar
No2127
Winter type
(WORs)
Semi-winter type
(SWORs)
Spring type
(SORs)
Eight B. napus accessions
Neighbour-joining tree of 210 B. napus accessions, eight
assembled accessions and 199 B. rapa accessions
Insertions of four transposable elements around BnaA10.FLC in different ecotypes
✔ Validated these TEs in 210 B. napus accessions (141 of which had ecotype information)
The role of FLC genes in the divergence of the three rapeseed ecotypes
SWORs
WORs SORs
✔ Due to the LINE insertion in the first exon of BnaA10.FLC, the loss-of-function mutation makes SORs require weak or no vernalization.
✔ An 824-bp hAT insertion in the last exon of BnaA02.FLC was identified as the lead PAV by PAV-GWAS in the SOR (spring) type.
✔ The MITE insertion in the promoter region of BnaA10.FLC enhances the expression of BnaA10.FLC, which leads to a requirement of strong vernalization for WORs.
✔ The vernalization requirement of SWORs is intermediate between the other two ecotypes due to the hAT insertion in the promoter region of BnaA10.FLC.
CONCLUSIVE RESULTS
Indicating a strong correlation between specific TE insertions in
BnaA10.FLC and ecotype classification
Haplotypes of six SNPs and the three
TEs located within the 5.0-kb
upstream and downstream regions
and the coding sequence of
BnaA10.FLC
3. Pangenome uncovers the potential of transposable elements (TEs) as powerful molecular markers
Crop | Transposable element | Associated trait
Maize | A Harbinger-like DNA transposon | Represses the expression of the ZmCCT9 gene to promote flowering under long-day conditions
Rice | A Gypsy retrotransposon | Enhances the expression of the OsFRDL4 gene and promotes aluminum tolerance
Orange | Two Copia retrotransposons independently inserted into the promoter region of the Ruby gene | Enhanced expression, driving convergent evolution of the blood orange trait
Maize | Ac/fAc (hAT family element) transposon | Induces expression of the pericarp color 2 gene (p2) by capturing the enhancer sequence of another gene
4. The tomato pangenome uncovers new genes and a rare allele regulating fruit flavor
(Gao et al., 2019)
Pangenome uncovers rare alleles
• Solanum pimpinellifolium (SP)
• Solanum cheesmaniae ssp. galapagense (SCG)
• Solanum lycopersicum L. var. cerasiforme (SLC)
• Solanum lycopersicum L. lycopersicum (SLL)
Phylogenetic and principal component analyses
(PCA) using the PAVs suggested that wild
accessions clearly separated from domesticated
accessions with only a few exceptions, and the two
domesticated groups (SLC and SLL) separated but
with clear overlaps
Violin graph
Principal component analyses (PCA)
“Who will last in the Run?”
Scatter plots Gene selection preference during tomato domestication and improvement
A rare promoter allele that modifies fruit flavor
Pan-genome analysis identified a ~4-kb substitution in the promoter region of TomLoxC (Solyc01g006540), which encodes a 13-lipoxygenase essential for C5 and C6 green-leaf volatile production in tomato fruit
A 4,151-bp non-reference allele of the TomLoxC promoter was captured in the pan-genome
It is a rare allele in cultivated tomatoes, which reflects strong negative selection during domestication.
• TomatoPan028690, a truncated part of the fruit weight gene Cell Size Regulator (CSR), was detected in all SP, 88.6% of SLC and 14.4% of SLL heirlooms.
✔ This supports that the deletion allele arose during domestication and has been largely fixed in cultivated tomatoes.
Human selection influenced fruit quality or additional phenotypes in some instances by targeting regulatory sequences
The frequency of the non-reference allele
• S. pimpinellifolium (SP): 47.4%
• SLC: 8.4%
• SLL heirlooms: 1.1%
• Modern SLL cultivars: 7.2%, all heterozygotes, most likely because of recent introgressions from wild into cultivated tomatoes. This is consistent with its selection during modern breeding, possibly as a consequence of selecting lines with superior stress tolerance in agricultural settings
✔ Expression levels of TomLoxC in
orange-stage fruit of accessions with
different promoter alleles
✔ Heterozygous TomLoxC promoter
genotypes have the strongest
expression in orange-stage fruit.
5. Pangenome helps trace back the domestication trajectory
Nature | Vol 588 | 10 December 2020
• Constructed chromosome-scale sequence assemblies for 20 accessions
• Paired-end and mate-pair Illumina short reads were assembled into scaffolds
• Chromium linked reads and chromosome conformation capture (Hi-C) data were used to arrange scaffolds into chromosomal pseudomolecules using the TRITEX pipeline
• A single-copy pan-genome was used for genetic analysis in a wider diversity panel: single-copy regions were extracted from each of the 20 assemblies and clustered into a non-redundant set of sequences
Single-copy sequence variation was translated into scorable markers that are amenable to population genetic analysis and association scans
Genome-wide association scan for lemma adherence on the basis of PAV markers
The lemma adherence association covered the NUDUM (NUD) gene
INFERENCE
All varieties of naked barley are thought to trace back to a single mutational event that deleted the entire NUD sequence
How significant is a single-copy pangenome?
6. Mapping of polymorphic inversions in the population
Objective: to discover inversions in a broader set of germplasm
• Hi-C-based inversion scans on Hi-C data from a diversity panel mapped to a single reference genome
✔ Scans of 69 barley genotypes (67 domesticated and 2 wild accessions) revealed a total of 42 events ranging from 4 to 141 Mb in size (mean size of 23.9 Mb)
✔ A notable finding was the prevalence of large (more than 5 Mb) inversion polymorphisms in current elite germplasm
Identification and characterization of a large inversion on chromosome 7H
Mapping populations:
1. RGT Planet (inversion carrier) × Hindmarsh (R × H)
2. Morex × Barke (M × B)
• The earliest cultivar that carried the inversion was Diamant, one of the donors of the semi-dwarf growth habit
• Diamant is a highly influential founder line of modern barley breeding and traces back to a mutant induced by gamma irradiation of the Czech cultivar Valticky
• Gene bank / germplasm study: none of the Valticky samples carried the inversion, whereas it segregated in the Diamant samples
This strongly suggests that mutation breeding in the 1960s gave rise to a cryptic large inversion, which, unbeknownst to breeders, segregates in elite varieties of barley
INFERENCE
• A map of inversion polymorphisms will provide breeders with a point of reference to avoid, or correctly interpret, crosses between carriers and non-carriers.
7. Expanding Gene-Editing Potential in Crop Improvement with Pangenomes
Identify non-recombinant inversions in the pangenome: high-precision identification of chromosomal rearrangement boundaries
CRISPR protein complex
Induce inversion / re-inversion
Genes locked in the region become accessible to recombination in the population
Reversal of inversions through CRISPR technology allows crossing of genes in inverted regions
(Fernandez, 2022)
✔ CRISPR-Cas can be used to study the effect of gene dosage by generating a series of allelic mutants through knock-out/knock-down mutation of specific variant alleles
e.g., pleiotropic effects of the mlo gene (barley) against powdery mildew
✔ Potential benefits of using a pangenome reference for genetic modification:
1. Genetic diversity analysis helps to identify potential target sites for genome editing
2. Identify CNVs that influence CRISPR-Cas mutation effectiveness
3. Identify novel target alleles and map their position on the pan-genome
4. Avoid off-target effects in multiplex editing by designing specific sgRNAs
Thus supporting accurate and specific guide RNA design
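Why a pangenome helps guide RNA design can be illustrated with a toy Python sketch: candidate target sites are collected from several haplotypes, and only those present in all of them are kept. The sketch assumes a 20-nt protospacer followed by an NGG PAM, scans the forward strand only, and performs no off-target scoring. The function names and all sequences are hypothetical.

```python
import re

def find_protospacers(seq):
    """Return all 20-nt protospacers followed by an NGG PAM (forward strand only)."""
    # The lookahead allows overlapping candidate sites to be reported.
    return {m.group(1) for m in re.finditer(r"(?=([ACGT]{20})[ACGT]GG)", seq)}

def conserved_guides(haplotypes):
    """Protospacers found in every haplotype, i.e. usable across the whole panel."""
    site_sets = [find_protospacers(seq) for seq in haplotypes.values()]
    return set.intersection(*site_sets)

# Hypothetical haplotypes: site 1 is shared, site 2 carries a SNP in hapB.
site1 = "ACGT" * 5
site2 = "TTCA" * 5
site2_variant = site2[:10] + "G" + site2[11:]
haplotypes = {
    "hapA": site1 + "TGG" + "AAAA" + site2 + "AGG",
    "hapB": site1 + "TGG" + "AAAA" + site2_variant + "AGG",
}
guides = conserved_guides(haplotypes)
print(guides)  # {'ACGTACGTACGTACGTACGT'}
```

A guide designed from a single reference (here, site 2 from hapA) would miss the variant allele in hapB; checking candidates against all haplotypes exposes that before the experiment.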
Crop pangenome | Findings | Reference
Maize pangenome (66 inbred lines) | Identified inversions; the largest inversion spans 75.5 Mb in the pericentric region of chromosome 2 | Schwartz et al., 2020, Nature Plants
Cotton pangenome (890 accessions) | Meta-GWAS and gene expression analysis, then gene knockout with CRISPR-Cas9: identified the previously uncharacterized gene GhIDD7, subsequently shown to control fibre length | Li et al., 2021, Genome Biol.
Rice pangenome (66 accessions), “Green revolution phenotype” | Identified 129 conserved gene loci; a CRISPR-Cas knock-out/knock-down study uncovered 31 high-yield-related genes, including six previously reported genes such as the sd1 semi-dwarf gene | Huang et al., 2018
8. Role of cis-regulatory elements (CREs) and their pan-genome identification for fine-tuning of gene expression
❑ CREs are noncoding DNA sequences capable of recruiting transcription factors and affecting gene expression
❑ CREs can be broadly subdivided into promoters and enhancers or silencers
(Zanini et al., 2021)
Genome editing of cis-regulatory elements: a hypothetical scenario of editing of Brassica
napus CLV3 homologues’ cis-regulatory elements to generate multiocular siliques and range
of variation in seed number. Brassica napus has two, mostly redundant, copies of BnCLV3,
so editing of both would be necessary
(Xu et al., 2021; Yang et al., 2018)
10. Importance of pan-genomics as an approach to explaining heterosis
❖ Pan-genomics can play an important role in unraveling the gene members and families contributing to heterosis, according to the proposed model
❖ Finding new genes and variants is essential to explaining and utilizing heterosis for crop improvement.
A model of heterosis proposed by Swanson-Wagner et al.
11. Pan-genome: a resource to explore the breeding potential of under-utilised crop species
Guava: fruit and leaf metabolites and fruit aroma volatiles of 27 guava accessions have been investigated. These datasets could be used to scan a guava pangenome for fruit-related traits
Integrating rich phenotype data
A super-pangenome of yam bean species (P. erosus, P. ahipa and P. tuberosus)
Helps to infer the effects of SVs on phenotype, including traits directly related to plant performance such as
✔ days to flowering and maturity
✔ plant height
✔ root biomass
By developing resources for under-utilised crops, novel genes related to agro-morphological traits can be detected and used to inform breeding programs or for introgression
CURRENT STATUS AND FUTURE ASPECTS OF PANGENOMIC STUDIES
A summary of plant pangenome studies.
Species | Single reference size | Pangenome size | Traits studied using the pangenome | Variant type | Reference
Brassica oleracea, B. macrocarpa (cultivated and wild cabbage) | (Bo TO1000) 488 Mb; 59,225 genes | 587 Mb; 61,379 genes | Disease resistance, flowering time, secondary metabolites | PAV | Golicz et al., 2016
Cajanus cajan (pigeon pea) | (Asha) 606 Mb; 53,612 genes | 622 Mb; 55,512 genes | Self-fertilization, disease resistance, seed weight | SNP, PAV | Zhao et al., 2020
Glycine soja (wild soybean) | (GsojaD, Shandong) 985 Mb; 57,631 genes | 986.3 Mb; 59,080 gene families | Disease resistance, flowering time, oil content, height and lodging, yield | CNV, PAV, SNP, InDel | Li et al., 2014
Gossypium hirsutum (upland cotton) | (TM-1) 2,347 Mb; 70,199 genes | 3,388 Mb; 102,768 genes | Flowering time, morphology, yield, fiber traits | PAV, SNP | Li et al., 2021
Oryza sativa | (Nipponbare) 384 Mb | Indica: 52,976 | Disease and stress resistance, grain width and size | SNP | Yao et al.
Zea mays (maize) | (B73) 2,182 | 103,538 genes | Disease resistance, flowering time | PAV, TE insertion | Hufford et al., 2021
Rice Pan-genome Array (RPGA): an efficient genotyping solution for pan-genome-based accelerated crop improvement in rice
Anurag Daware, Ankit Malik, Rishi Srivastava, Durdam Das, Ranjith K. Ellur, Ashok K. Singh, Akhilesh K. Tyagi and Swarup K. Parida (2022)
✔ The “Rice Pan-genome Genotyping Array (RPGA)” is the first pan-genome-based SNP genotyping assay developed for crop plants
✔ Efficiently captures haplotype variation from the entire 3K rice pan-genome, representing diverse populations (indica, tropical/temperate japonica, aus, aromatic, etc.)
✔ RPGA assays a total of 80,504 SNPs, including 60,026 SNPs from the 12 Nipponbare chromosomes and 20,478 SNPs from the 12 pseudo-chromosomes of the 3K rice pan-genome.
‘RICE PAN-GENOME GENOTYPING ARRAY’ ANALYSIS PORTAL (RAP)
http://www.rpgaweb.com
3K Rice Reference Panel and subsequent GWAS
Super-pangenome: a way forward
“Super-pangenome is the approach of developing a pangenome of the pangenomes of different species for a given genus.”
Khan et al., 2020
Approaches for the construction of a super-pangenome
• Super-pangenomes support the breeding of crops that are better adapted to diverse environments and more resilient to climate change, by analyzing changes in gene frequency during domestication/evolution
A super-pangenome study involving polyploid Brassica napus and its two diploid progenitor genomes gives:
• Comparative modelling of the gene loss propensity in diploid and polyploid Brassica spp.
▪ Diploids: primarily associated with transposable elements
▪ Polyploid B. napus: associated with homoeologous recombination
• Identification of beneficial haplotypes that could be introgressed through conventional breeding
(Bayer et al., 2021, Plant Biotechnology Journal)
• The tomato super-pangenome identified functional polymorphisms in genes associated with fruit flavour (LIN5, ALMT9, AAT1, CXE1 and LoxC).
• The cotton super-pangenome gives knowledge of the genomic diversity among five polyploids and their monophyletic origin
▪ Polyploid genomes are conserved in gene content and synteny
▪ They are diversified by sub-genomic transposon exchanges that equilibrate genome size, evolutionary rate and positive selection between homoeologs within and among lineages
• The super-pangenome of banana identified gene differences between the Musa and Ensete genera, as well as 12,310 new gene models in the species, forming distinct PAV clusters between the Ensete and Musa accessions
(Chen et al., 2020, Nature Genetics)
APPLICATIONS OF PANGENOME
1. Finding novel genes
2. Revealing niche-specific fitness
3. Evolution, domestication and breeding history
4. Helping to identify potential target sites for genome editing
5. Facilitating taxonomic identification
6. Approach to explaining heterosis
7. Elucidating host-pathogen interaction
8. Strengthening proteogenomics
CONCLUSION
WHAT TO ADD?
&
WHERE TO ADD?
Beyond pan-genome ?
Pan-Transcriptome
Potent bioinformatic tools
Pan-Metabolome Pan- Epigenomes
THANKS to science
• Comparative modelling of the propensity for gene loss in the three species revealed that, in the diploids, genes prone to loss are primarily associated with transposable elements, whereas in the polyploid B. napus the propensity for gene loss is associated with the position of the gene on the pseudomolecule.
• Studying how genes change in frequency between domesticated crops and their wild relatives using rapeseed (Brassica napus) reference genomes:
Two winter-type oilseed rapes (WORs): Darmor-bzh and Tapidor
Two semi-winter oilseed rapes (SWORs): ZS11 and NY7
Genome-wide comparative analysis of eight well-assembled genomes and the Darmor-bzh genome identified core gene clusters, dispensable gene clusters and specific gene clusters.
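The split into core, dispensable and specific gene clusters can be sketched in a few lines of Python. This is only a minimal illustration, assuming a presence/absence table keyed by cluster ID; the function name `classify_clusters`, the cluster IDs and the toy calls are invented for the example and do not come from the study.

```python
from typing import Dict, List


def classify_clusters(pav: Dict[str, List[int]]) -> Dict[str, str]:
    """Label each gene cluster from its presence (1) / absence (0) calls.

    core        : present in every accession
    specific    : present in exactly one accession
    dispensable : present in two or more accessions, but not all
    absent      : present in none (kept only to make bad input visible)
    """
    labels = {}
    for cluster_id, calls in pav.items():
        n_present = sum(calls)
        if n_present == len(calls):
            labels[cluster_id] = "core"
        elif n_present == 0:
            labels[cluster_id] = "absent"
        elif n_present == 1:
            labels[cluster_id] = "specific"
        else:
            labels[cluster_id] = "dispensable"
    return labels


# Toy table: hypothetical cluster IDs, four accessions per row (invented values).
toy_pav = {
    "cluster_A": [1, 1, 1, 1],
    "cluster_B": [1, 0, 1, 0],
    "cluster_C": [0, 0, 0, 1],
}
labels = classify_clusters(toy_pav)
print(labels)
```

In a real analysis the presence/absence calls would come from gene-family clustering across the assembled genomes; the thresholds used here are the simplest possible ones.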
Assemblies were created as follows:
• ZS11 was assembled de novo using PacBio, Hi-C and BioNano data.
• The other seven accessions were assembled by integrating high-coverage PacBio and Illumina data; two of them were verified with Hi-C or BioNano data.
Multiple high-quality reference genomes representing different ecotypes are necessary for a better understanding of genome structure and the genetic basis of morphotype.
Materials and methods
GWAS of flowering time in the nested association mapping (NAM) population.
Manhattan plots for flowering time analysed by SNP-GWAS in winter and spring environments, respectively.
Manhattan plots for flowering time analysed by PAV-GWAS in winter and spring environments, respectively.
GWAS (-lmm 1: Wald test) was performed with 3,971,412 SNPs or 27,216 PAVs in the BN-NAM population containing 2,141 RILs.
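The idea behind a PAV-GWAS scan can be illustrated with a deliberately simplified sketch. The study used a linear mixed model with a Wald test; the code below swaps in a plain Welch t statistic per PAV and ignores population structure and kinship, so it shows only the marker-by-marker logic. All names (`welch_t`, `scan_pavs`, `pav_1`, `pav_2`) and the flowering-time values are invented for the example.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Optional


def welch_t(present: List[float], absent: List[float]) -> Optional[float]:
    """Welch's t statistic for the difference in phenotype means.

    Returns None when both groups have zero variance (statistic undefined).
    """
    se = sqrt(variance(present) / len(present) + variance(absent) / len(absent))
    if se == 0:
        return None
    return (mean(present) - mean(absent)) / se


def scan_pavs(pav_calls: Dict[str, List[int]], phenotype: List[float]) -> Dict[str, float]:
    """Return one t statistic per PAV, skipping PAVs that cannot be tested."""
    stats = {}
    for pav_id, calls in pav_calls.items():
        present = [y for c, y in zip(calls, phenotype) if c == 1]
        absent = [y for c, y in zip(calls, phenotype) if c == 0]
        if len(present) < 2 or len(absent) < 2:
            continue  # need at least two lines per group to estimate variance
        t = welch_t(present, absent)
        if t is not None:
            stats[pav_id] = t
    return stats


# Invented flowering times (days) for six lines and two hypothetical PAVs.
flowering_days = [150.0, 152.0, 151.0, 170.0, 172.0, 171.0]
pav_calls = {
    "pav_1": [0, 0, 0, 1, 1, 1],  # presence tracks late flowering
    "pav_2": [0, 1, 0, 1, 0, 1],  # presence unrelated to flowering
}
stats = scan_pavs(pav_calls, flowering_days)
lead_pav = max(stats, key=lambda k: abs(stats[k]))
print(lead_pav, round(stats[lead_pav], 2))
```

The PAV with the largest absolute statistic plays the role of the "lead PAV" on a Manhattan plot; a real scan would report p-values from the mixed model instead.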
Insertions of four transposable elements (TEs) around BnaA10.FLC in different ecotypes
✔ Four TEs were identified in the promoter and coding region of BnaA10.FLC
✔ These TEs were validated in 210 B. napus accessions (141 of which had ecotype information)
The role of FLC genes in the divergence of the three rapeseed ecotypes.
RESULTS
✔ All the WORs contained the MITE insertion
✔ 85% (22/26) of the SORs contained the LINE insertion
✔ 81% (80/99) of the SWORs contained the hAT insertion
[Figure: TE insertions in SWORs, WORs and SORs]
[Figure: Flowering time of lines with different BnaA02.FLC alleles in spring and winter environments, respectively]
An 824-bp hAT insertion in the last exon of BnaA02.FLC was identified as the lead PAV by PAV-GWAS.
SOR (spring type)
✔ Owing to the LINE insertion in the first exon of BnaA10.FLC, the resulting loss-of-function mutation means that SORs require weak or no vernalization.
✔ The MITE insertion in the promoter region of BnaA10.FLC enhances the expression of BnaA10.FLC, which leads to a requirement for strong vernalization in WORs.
✔ The vernalization requirement of SWORs lies between those of the other two ecotypes owing to the hAT insertion in the promoter region of BnaA10.FLC.
✔ An 824-bp hAT insertion in the last exon of BnaA02.FLC was identified as the lead PAV by PAV-GWAS.
CONCLUSIVE RESULTS
SOR (spring type)
824-bp hAT insertion in the last exon of BnaA02.FLC
BnaA02.FLC has a stronger flowering-repression effect than BnaC02.FLC
Winter type (WORs): Tapidor, Quinta
Both possess BnaA10.FLC
Tapidor: two copies of BnaA02.FLC
One copy of BnaC02.FLC
Quinta
One copy of BnaA02.FLC
This may cause the difference in flowering time between them
Shengli
Zheyou7
Tapidor
BnaC02.FLC gene is replaced by BnaA02.FLC
BnaA02.FLC was not expressed at any stage + one functional BnaC02.FLC
No2127
Gangan
Westar
Gene BnaC02.FLC is completely absent
✔ Cumulative expression levels of the three FLC genes and flowering-time characterization of the eight assembled B. napus accessions, associated with PAVs and copy number
Stacked histogram showing FLC expression at T0–T3
T0, T1, T2 and T3: 24, 54, 82 and 115 days after sowing
Panel labels: spring type (SOR); semi-winter type
• Among the unfavorable genes, seven were not full length.
• TomatoPan028690, a truncated part of the fruit weight gene Cell Size Regulator (CSR), was detected in all SP, 88.6% of SLC and 14.4% of SLL heirlooms.
✔ This supports the view that the deletion allele arose during domestication and has been largely fixed in cultivated tomatoes.
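One common way to ask whether a presence frequency differs between two groups (for example 88.6% versus 14.4%) is Fisher's exact test on a 2x2 table of present/absent counts. The sketch below implements the two-sided test with the standard library only. The slide does not state which test was used, and it gives percentages rather than counts, so the group sizes in the example (44 and 97 accessions) are invented purely to reproduce those percentages.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (present, absent) counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x "present" calls in the first group.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts chosen only to match the quoted percentages:
# group 1: 39 of 44 present (88.6%); group 2: 14 of 97 present (14.4%).
p_value = fisher_exact_two_sided(39, 5, 14, 83)
print(f"p = {p_value:.3g}")
```

A very small p-value indicates that the frequency difference between the two groups is unlikely to be due to sampling alone, which is the kind of evidence used to flag a PAV as a candidate target of selection.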
Selection of promoter PAVs during tomato breeding.
A total of 90,929 nonreference contigs
3,741 nonreference sequences localized in putative promoter regions
980 promoter sequences under selection
The expression of their downstream genes was checked (RNA-Seq data for orange-stage fruit) in the 397 accessions, together with the PAV patterns of these promoters and of those in the reference genome.
RESULT: Of these promoters, 240 had downstream genes with significantly different expression.
Human selection influenced fruit quality or additional phenotypes in some instances by targeting regulatory sequences.
A rare promoter allele that modifies fruit flavor
Pan-genome analysis revealed a ~4-kb substitution in the promoter region of TomLoxC (Solyc01g006540), which encodes a 13-lipoxygenase essential for C5 and C6 green-leaf volatile production in tomato fruit.
A 4,151-bp nonreference allele of the TomLoxC promoter was captured in the pan-genome.
It is a rare allele in cultivated tomatoes, which reflects strong negative selection during domestication.
Confirmation of the involvement of TomLoxC in apocarotenoid biosynthesis:
✔ QTL mapping identified TomLoxC as the cause of changed levels of flavor-associated lipid- and carotenoid-derived volatiles.
✔ Analysis of transgenic tomato fruit in which TomLoxC expression was repressed revealed a previously unknown alternative apocarotenoid production route.
❑ The tomato pan-genome harbors useful genetic variation that was not visible in the ‘Heinz 1706’ reference genome alone.
❑ The tomato pan-genome revealed extensive domestication- and improvement-associated loci and genes, with an evident bias toward those involved in defense response.
On average, each of the 20 genotypes
contained 2.9 Mb of single-copy sequence not
present in any other assembly
Procedure to test the suitability of the single-copy pan-genome for genetic analysis:
The abundance of 160,716 single-copy clusters that overlap structural variants was estimated by counting cluster-constituent k-mers (k = 31) in sequence reads of the diversity panel.
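Estimating cluster abundance by k-mer counting can be sketched as follows. This is a minimal illustration under simplifying assumptions: k-mers are taken from the forward strand only (real pipelines usually count canonical k-mers on both strands and use a dedicated counter), and abundance is summarised as the mean count of a cluster's k-mers. The names `kmers` and `cluster_abundance`, the cluster IDs and the toy sequences are invented; the toy run uses k = 4 so the strings stay readable, whereas the slide's analysis used k = 31.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator


def kmers(seq: str, k: int) -> Iterator[str]:
    """Yield every k-mer of seq (forward strand only, for brevity)."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def cluster_abundance(cluster_seqs: Dict[str, str],
                      reads: Iterable[str],
                      k: int = 31) -> Dict[str, float]:
    """Mean read count of each cluster's constituent k-mers."""
    cluster_kmers = {cid: set(kmers(seq, k)) for cid, seq in cluster_seqs.items()}
    wanted = set().union(*cluster_kmers.values())
    # Count only k-mers that belong to some cluster, to keep memory small.
    counts = Counter(km for read in reads for km in kmers(read, k) if km in wanted)
    return {
        cid: (sum(counts[km] for km in kms) / len(kms) if kms else 0.0)
        for cid, kms in cluster_kmers.items()
    }


# Toy data (invented): two clusters and two reads.
clusters = {"cluster_1": "ACGTAC", "cluster_2": "TTTTGG"}
reads = ["ACGTACGT", "ACGTAC"]
abundance = cluster_abundance(clusters, reads, k=4)
print(abundance)
```

A cluster whose k-mers are well covered in an accession's reads is scored as present, and one with near-zero counts as absent, which turns each single-copy cluster into a presence/absence marker for genetic analysis.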
• Local PCA and haplotype analysis in a panel of 200 domesticated and 100 wild varieties of barley indicated a single origin of the inverted haplotype.
• The inversion occurred only among domesticated barley of Western geographical origin, indicating that it arose or rose to high frequency after domestication. The inverted region contains high-confidence genes in the Morex cultivar. The gene closest to the inversion breakpoint, at 448 kb from the distal breakpoint in the non-carrier Morex, was HvCENTRORADIALIS (HvCEN).
• Although induced mutants of HvCEN flower very early, natural variation in HvCEN has previously been implicated in environmental adaptation to northern European climates.
• All of the inversion carriers analysed had HvCEN haplotype III, which is associated with later flowering in spring barley varieties from northern Europe.
Neighbor-joining tree of 271 diverse rice
accessions belonging to three different cultivated
and wild rice species viz. O. sativa, O. nivara and
O. rufipogon
RPGA-based SNP genotyping was used to decode natural allelic diversity and population genetic structure efficiently, in order to understand the domestication pattern in the rice gene pool.
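Tree-building methods such as neighbor joining start from a matrix of pairwise genetic distances between accessions. The sketch below shows one simple distance that can be computed from SNP genotypes, assuming calls coded as 0, 1 or 2 copies of the alternate allele with `None` for missing data. It is not the method reported for the RPGA analysis; the function names, accession labels and genotype values are all invented for illustration.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Genotype = List[Optional[int]]  # 0/1/2 alternate-allele counts, None = missing


def allele_sharing_distance(g1: Genotype, g2: Genotype) -> float:
    """Mean per-SNP allele difference, scaled to 0 (identical) .. 1 (opposite homozygotes)."""
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNPs typed in both accessions")
    return sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def distance_matrix(genotypes: Dict[str, Genotype]) -> Dict[Tuple[str, str], float]:
    """Distance for every unordered pair of accessions."""
    return {
        (x, y): allele_sharing_distance(genotypes[x], genotypes[y])
        for x, y in combinations(sorted(genotypes), 2)
    }


# Invented genotypes at five SNPs for three hypothetical accessions.
toy_genotypes = {
    "acc_basmati": [0, 2, 2, 0, 1],
    "acc_japonica": [0, 2, 2, 0, 2],
    "acc_indica": [2, 0, 0, 2, None],
}
distances = distance_matrix(toy_genotypes)
for pair, dist in distances.items():
    print(pair, dist)
```

In this toy matrix the first two accessions are close and both are far from the third, which is the kind of signal a neighbor-joining tree turns into distinct clusters.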
✔ Indian traditional Basmati accessions clustered distinctly from aromatic rice accessions from both north-eastern India and other parts of the world.
✔ Traditional Basmati displayed a closer genetic relationship with japonica and aus accessions.
Evolved Basmati: traditional Indian Basmati × indica variety (IND 1)
✔ Two sub-groups were identified within the indica subpopulation:
INDI, corresponding to Xian/Indica-2 (XI-2) from South Asia
INDII, corresponding to XI-3 from Southeast Asia
These were previously reported along with two other indica subpopulation groups (XI-1A from East Asia and XI-1B of modern varieties of diverse origins).
RESULTS
High-resolution QTL mapping was conducted using the RPGA-based ultra-high-density 535 genetic linkage map (Sonasal × PB 1121 RILs).
The RPGA-based GWAS detected many previously known major grain size/weight genes, such as GS3 and PGL1 (grain length and length-to-width ratio) and GW5 (grain width), which validates the ability of pan-genome-based GWAS to detect true associations.
The WDR12 gene (a candidate gene regulating grain length) underlying the qLWR7 QTL, validated by both RPGA-based GWAS and QTL mapping, is known to … and thus appears to be a promising …

Genome to pangenome : A doorway into crops genome exploration

  • 1. GENOME TO PANGENOME: A doorway into crop’s genome exploration. KIRAN K.M, PGS20AGR8449, Department of Genetics and Plant Breeding. MASTERS SEMINAR 1, UNIVERSITY OF AGRICULTURAL SCIENCES, DHARWAD
  • 2. Our journey through
 Introduction: How does genome-assisted crop improvement work, and what information is missing from this approach? How important is this “missing information”? How to capture this information: birth of the pangenome. How to represent and analyze a pangenome effectively to dig out new kinds of information? Is the information mined from pangenome-oriented GWAS worthwhile? What are the applications and future perspectives of pangenome-oriented crop improvement? Is the pangenome the end of the story? Conclusion
  • 3. Entire genic/allelic variant forms within a species, shaped by: domestication; ecotype differentiation; selection pressure; birth or death of some genes, or their modification via deletion, duplication, transposition, etc.
  • 4. SINGLE-REFERENCE GENOME: comparative genome analysis oriented around a single reference genome. What if our reference genome is too incomplete to capture all the information? We need to capture the entire genetic diversity of the species: a doorway into single-reference-free pan-genome analysis.
  • 5.
  • 7. “Boosting-up” of crop improvement programs: genomic data derived from multiple accessions and cultivars capture the full extent of sequence variation within a species. The PAN-GENOMIC approach helps figure out new genes and alleles directly related to phenotype. “A pangenome refers to the full complement of genes of a biological clade, such as a species, which can be partitioned into a set of core genes that are shared by all individuals and a set of dispensable genes that are partially shared or individual specific.”
  • 8. Hervé Tettelin, Duccio Medini ✔ Pangenomes were first introduced by Tettelin et al. to describe gene diversity in Streptococcus agalactiae. Michele Morgante ✔ Pangenomics in plants was first proposed by Morgante et al. ✔ 2014: first crop plant pangenome, in soybean (Glycine max)
  • 10. Extensive structural variants (SVs) Presence-absence variation (PAV) Copy number variation (CNV) Chromosomal rearrangements
  • 11. Structural variations (SVs): the untapped genetic potential. Origin of SVs (PAV, CNV; Gabur et al., 2018): recombination errors, e.g. non-allelic homologous recombination (NAHR); replication errors, e.g. microhomology-mediated break-induced replication (MMBIR) and fork stalling and template switching (FoSTeS); DNA break repair errors, e.g. non-homologous end joining (NHEJ); non-reciprocal exchanges, e.g. homoeologous non-reciprocal transpositions (HNRT); polyploidization and/or whole-genome duplication.
  • 12. SVs affecting complex agronomic traits ❑ Resistance to biotic stress ➢ Rhg1 locus: resistance to cyst nematode (soybean) ➢ Absence of a sulfotransferase gene, in PAVs of various sizes: resistance to Striga (sorghum) ➢ Deletions in the Pi21 gene result in quantitative and durable resistance against blast disease (rice)
  • 13. ❑ Resistance to abiotic stress ➢ PAV of the Sub1A gene (Xu et al. 2006), which encodes an ERF-like protein: submergence tolerance; PAV of two ERF genes, SNORKEL1 and SNORKEL2 (Hattori et al. 2009): deep-water response (rice) ➢ Tolerance of phosphorus starvation at the Pup1 locus: attributed to the presence of a receptor-like cytoplasmic kinase gene, PSTOL1 (rice) ❑ Plant architecture ➢ An extra copy of Rht-D1b, resulting from duplication of a >1 Mb region, causes a >70% reduction in plant height (wheat) ❑ Yield and grain quality ➢ A 1,212-bp deletion 5 kb downstream of the GW5 gene causes variation in grain width and grain weight (rice) ❑ Flowering time ➢ A CNV at the HvFT1 locus was found to be associated with flowering time variation (barley)
  • 14. Compared with the entire pan-genome, genes in the flexible genome were significantly enriched with those involved in biological processes, such as defense response, photosynthesis and biosynthetic processes
  • 15. Dynamics of pan-genome compartments ➢ Gene birth and death processes • Errors during recombination • De novo genes • Duplication followed by rapid divergence and neo-functionalization ➢ Transposable elements • In maize, helitron TE activity can modify 50% of the genome structure. ➢ Horizontal transfers • Conjugation • Transduction • Transformation (Christine et al., 2019)
  • 16. Pan-genome construction and assembly methods (from MULTIPLE GENOMES): de novo sequencing and comparison; iterative mapping and assembly; graph-based approaches: 1. sequence and variation graphs (VGs), 2. practical haplotype graphs (PHG). De novo sequencing and comparison: ▪ Errors in assembly and annotation may lead to false calling of variation ▪ Costly; requires high-quality data with high sequencing coverage ▪ This limits the application to relatively few individuals. Iterative mapping and assembly: start with a single reference genome as the base for the pangenome; align whole-genome sequence data from multiple individuals to the reference genome; assemble the non-aligning sequence reads and add them to the reference to build a pangenome.
  • 17. Reference guided assembly De-novo genome assembly (Danilevicz, 2020)
  • 18. Pan-genome graphs (Eizenga et al., 2020) • Represent the sequence content and the corresponding functional annotation of an entire population, species, or clade • Compress redundant sequences into smaller data structures while retaining information on genomic diversity and whole-genome relationships
  • 19. Ideally, a complete and fully annotated pangenome graph would integrate genomic, epigenomic, and transcriptomic datasets, thus facilitating downstream functional and comparative analyses
  • 20. Variation Graph (VG): genome region harboring the CsFT locus among six cucumber accessions, showing a 44.0 kb complex insertion, a 25.3 kb complex insertion and a 39.3 kb canonical insertion (Li et al., 2022, Nature Communications)
  • 21. Practical haplotype graph (PHG): database and haplotype creation; trellis graph representing genic and intergenic regions (Jensen et al., 2020, The Plant Genome). The aim of the PHG is to determine which haplotypes, or genotypes of parental haplotypes, that have been sequenced at high coverage are present in progeny that have been sequenced at low coverage.
  • 22.
  • 23. Model describing the sizes of the core genome and pangenome (orthologous gene clusters). All genes: 61,379 genes; 35,853 gene families. Core: 49,895 genes; 28,532 gene families. ❖ Extrapolation of pangenome size leads to a • predicted pangenome of 63,865±31 genes (37,766±62 gene families) • predicted core genome size of 49,740±164 genes (28,496±91 gene families). Golicz et al., 2016, Nature Communications, DOI: 10.1038/ncomms13390
  • 24. How many genomes are needed to capture the whole gene content? ✔ A core/pangenome ratio below 85% indicates extensive adaptation. ✔ In plants, the core genome represents 40-80% of the total pangenome. (Munir et al., 2020)
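The 85% threshold on slide 24 can be checked against the gene counts on slide 23. The snippet below is a minimal sketch; the function name `core_fraction` is hypothetical, and pairing 49,895 with "core genes" and 61,379 with "all genes" is a reading of the flattened figure labels rather than something the slide states outright.

```python
def core_fraction(core_genes: int, pan_genes: int) -> float:
    """Return the share of the pangenome that is made up of core genes."""
    if pan_genes <= 0 or not 0 <= core_genes <= pan_genes:
        raise ValueError("expected 0 <= core_genes <= pan_genes and pan_genes > 0")
    return core_genes / pan_genes


# Counts taken from slide 23 (Brassica oleracea); which number belongs to
# which label is an assumption.
ratio = core_fraction(core_genes=49_895, pan_genes=61_379)
print(f"core/pangenome ratio: {ratio:.1%}")
print("below the 85% threshold" if ratio < 0.85 else "at or above the 85% threshold")
```

Under that reading, the ratio falls inside the 40-80% to 85% band the slide describes for plants with substantial dispensable gene content.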
  • 25. Power-law regression for new genes: the number n of new genes is plotted against the number N of genomes sequenced (Tettelin et al. 2008). ✔ Blue curves are least-squares fits of the exponential function, as in the original pangenome model. ✔ Red curves are least-squares fits of the power-law function. Open pangenome: α < 1. Closed pangenome: α > 1.
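The open/closed rule on slide 25 can be illustrated with a small fit of the power law n = κ·N^(−α). This is a sketch under stated assumptions: it fits a straight line to log n versus log N, which is a simplification of the least-squares fit on the raw counts shown on the slide; the function names and the synthetic data are hypothetical; and treating α = 1 as "open" is a choice the slide does not make.

```python
import math


def fit_power_law(genome_counts, new_genes):
    """Fit n = kappa * N**(-alpha) by ordinary least squares on log-log data."""
    xs = [math.log(n_genomes) for n_genomes in genome_counts]
    ys = [math.log(n_new) for n_new in new_genes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (kappa, alpha)


def pangenome_type(alpha: float) -> str:
    """Slide 25 rule: alpha > 1 is closed, alpha < 1 is open (alpha == 1 treated as open here)."""
    return "closed" if alpha > 1 else "open"


# Synthetic example: new genes per added genome decay slowly (alpha = 0.6).
genome_counts = list(range(2, 11))
new_genes = [500.0 * n ** -0.6 for n in genome_counts]

kappa, alpha = fit_power_law(genome_counts, new_genes)
print(f"kappa = {kappa:.1f}, alpha = {alpha:.2f}, pangenome is {pangenome_type(alpha)}")
```

With real data the counts of new genes are noisy and must be positive for the log transform, so genomes that add zero new genes would need separate handling.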
  • 26. Tools for pangenome analysis (Aggarwal et al., 2022). Software / Tool | Description / Role | URI link
    PanSeq | Extracts regions unique to a genome, identifies SNPs and constructs the input file for phylogeny programs | https://lfz.corefacility.ca/panseq/
    PanFunPro | Homology detection and pairwise genome analysis in the pan/core genome | https://zenodo.org/record/7583#.YTR36p0zY2w
    PGAP | Detection of homologous and orthologous genes, SNPs, phylogenetic studies, pangenome plotting and functional annotation | http://pgap.sf.net
    PanACEA | Identification of genomic regions that are phylogenetically dissimilar | https://github.com/JCVenterInstitute/PanACEA
    PGAP-X | Analyzes genome diversity and visualizes genome structure and gene content to understand evolution | http://pgapx.ybzhao.com/
    PAN2HGENE | Identifies new products, altering the behavior of the α value in the pangenome without altering the original genomic sequence | https://sourceforge.net/projects/pan2hgene-software
    BGDMdocker | Pangenome analysis, visualization, clustering and genome annotation | https://www.docker.com/whatisdocker
  • 27.
  • 28. MATERIALS AND METHODS Pan-Genome Assembly and Annotation Gene Presence-Absence Variations(gPAVs) SNP Discovery and Annotation Sorghum Diversity and Population Structure Genome-Wide Association Analysis (GWAS) : Two different mapping populations having the phenotypic data of 10 traits Drought RNA-Seq Assay Analysis
  • 29. Pan-Genome Assembly and Annotation: iterative mapping and assembly approach (Bowtie2 v2.3.4, IDBA_UD assembler). Start with the sorghum reference assembly v3.0.1 and add whole-genome sequence data iteratively: reads from 176 sorghum accessions with a minimum of 10X coverage were mapped to the reference; unmapped reads were assembled; only assembled contigs of more than 500 bp were considered and appended to the reference genome sequence. The assembled sequences were compared against the NCBI non-redundant nucleotide database with BLASTn, and sequences with homology to sorghum mitochondrial or chloroplast sequences, or with hits outside the green plant group (Viridiplantae), were removed.
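The "filter, then append" step of one iteration can be sketched as below. This is only an illustration of the bookkeeping, not the study's pipeline: mapping (Bowtie2), assembly (IDBA_UD) and the BLASTn screen are assumed to have run already, and `extend_reference`, `contaminant_ids` and the toy sequences are hypothetical names. The slides give both "more than 500 bp" and "minimum contig length: 500 bp"; the inclusive cutoff used here is an assumption.

```python
MIN_CONTIG_LEN = 500  # bp; inclusive cutoff is an assumption (see lead-in)


def extend_reference(reference, contigs, contaminant_ids, min_len=MIN_CONTIG_LEN):
    """Return a new pangenome dict: the reference plus contigs that pass the filters.

    reference       -- dict of sequence name -> sequence (current pangenome)
    contigs         -- dict of contig name -> sequence assembled from unmapped reads
    contaminant_ids -- names flagged by the homology screen (organelle or
                       non-green-plant hits), to be discarded
    """
    pangenome = dict(reference)  # copy so the input is not modified
    for name, seq in contigs.items():
        if name in contaminant_ids:
            continue  # removed by the homology screen
        if len(seq) < min_len:
            continue  # too short to be appended
        pangenome[name] = seq
    return pangenome


# Toy data: one reference chromosome and three assembled contigs.
reference = {"chr1": "ACGT" * 10}
contigs = {"c1": "A" * 600, "c2": "C" * 499, "c3": "G" * 700}
pangenome = extend_reference(reference, contigs, contaminant_ids={"c3"})
print(sorted(pangenome))
```

In the iterative approach the returned pangenome becomes the reference for the next accession, so each round only has to assemble reads that still fail to map.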
  • 30. RepeatMasker v4.0.7: masked repetitive elements (>90 percent coverage with greater than 90 percent identity) using sorghum as the species. AUGUSTUS v3.3.2: sorghum expressed sequence tags (ESTs) from GenBank were aligned with tBLASTx, and genes were predicted with homology and ab initio methods. RNA-Seq mapping hints from 25 accessions were used for combined-evidence ab initio gene prediction, and the 3,589 genes supported by mapped EST sequences were retained. On average, each iteration of the process added 1.9 Mb. 263.7 Mbp; 89.2 Mb removed. Identified 11,057 to 17,616 variable genes in the 176 genomes ✔ Average gene sequence length: 1,567 bp ✔ Average exons per gene: 3.6
  • 31. Sorghum Pan-Genome Gene PAV (gPAV): whole-genome sequence reads of all 354 sorghum accessions were mapped with Bowtie2 v2.3.4; gene models on contigs longer than 1 Kbp were used in this analysis; the R “ape” package was used to construct an NJ (neighbour-joining) tree. Cluster analysis: all-by-all BLASTp followed by MCL. Gene enrichment analysis: R “topGO” package using the “Elim” method. SNP Discovery and Annotation: whole-genome sequence reads of 354 accessions were quality trimmed using Trimmomatic; Bowtie2 v2.3.4: paired reads mapped to the pangenome; Picard tools: filtering of duplicate reads; GATK v.4.1: variants called against the reference (pan-genome); SnpEff v.4.3: functional annotation of SNPs.
  • 32. RESULTS ▪ Closed-type pangenome: 35,719 genes • 16,821 (47%) core genes • 18,898 variable genes ▪ 30 genes uniquely present ▪ 3,183 (8.9%) uniquely absent ▪ Total: 2 million SNPs; 91,319 SNPs in the extra-contig assembly ▪ Variable genes are shorter, with fewer exons ▪ Variable genes have fewer synonymous SNPs and a similar number of non-synonymous SNPs compared to core genes
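The categories on slide 32 (core, variable, uniquely present, uniquely absent) follow directly from a gene presence-absence matrix. The sketch below shows one way to derive them; the function name, the toy accessions `acc1`..`acc4` and genes `g1`..`g4` are hypothetical, and the definitions ("core" = present in every accession, "uniquely present/absent" = present/absent in exactly one) are the usual ones, assumed here rather than quoted from the study.

```python
def classify_genes(pav):
    """Split genes into core and variable sets from a presence-absence table.

    pav -- dict of gene -> dict of accession -> bool (True = gene present)
    """
    core, variable = [], []
    uniquely_present, uniquely_absent = {}, {}
    for gene, calls in pav.items():
        present = [acc for acc, is_present in calls.items() if is_present]
        absent = [acc for acc, is_present in calls.items() if not is_present]
        if not absent:
            core.append(gene)  # found in every accession
            continue
        variable.append(gene)  # missing from at least one accession
        if len(present) == 1:
            uniquely_present[gene] = present[0]
        if len(absent) == 1:
            uniquely_absent[gene] = absent[0]
    return core, variable, uniquely_present, uniquely_absent


# Toy matrix with four accessions.
pav = {
    "g1": {"acc1": True, "acc2": True, "acc3": True, "acc4": True},
    "g2": {"acc1": True, "acc2": False, "acc3": False, "acc4": False},
    "g3": {"acc1": True, "acc2": True, "acc3": False, "acc4": True},
    "g4": {"acc1": True, "acc2": False, "acc3": False, "acc4": True},
}
core, variable, uniq_present, uniq_absent = classify_genes(pav)
print("core:", core)
print("variable:", variable)
print("uniquely present:", uniq_present)
print("uniquely absent:", uniq_absent)
```

The same bookkeeping reproduces the slide's split: 16,821 core plus 18,898 variable genes sum to the 35,719 total.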
  • 33. A - Reference whole genome sequence reads mapping B - Drought expression (RNASeq) sequence mapping density C - Gene density D - Genes commonly present in all accessions (core genes) E - Genes absent in at least one of the accessions (variable genes) F - SNP density G - Insertions and deletions (indels). SNP density • Extra contigs: 0.52/Kbp • Rest: 2.71/Kbp • 210,805 contigs • Minimum contig length: 500 bp
  • 34. gPAV-based neighbour-joining tree with histogram bars ▪ Among the 35,719 total genes, 53% exhibited genic variation, which was used to estimate the relationships among the accessions ✔ The largest number of genes ▪ uniquely present: Macia (9 genes) ▪ uniquely absent: PI660645 (372 genes). This indicates their evolutionary distance from the other accessions.
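The two SNP densities on slide 33 can be related to the SNP count on slide 32 with a little arithmetic. This is a rough consistency check, not a result from the study: it assumes the 91,319 extra-contig SNPs and the 0.52 SNPs/Kbp density describe the same sequence, and the variable names are hypothetical.

```python
# Figures quoted on slides 32 and 33.
extra_contig_snps = 91_319        # SNPs called on the extra contigs
extra_contig_density = 0.52       # SNPs per Kbp on the extra contigs
reference_density = 2.71          # SNPs per Kbp on the rest of the pangenome

# Length of extra-contig sequence implied by count / density (Kbp -> Mbp).
implied_extra_mbp = extra_contig_snps / extra_contig_density / 1000

# How much denser SNPs are on the reference portion than on the new contigs.
fold_difference = reference_density / extra_contig_density

print(f"implied extra-contig sequence: {implied_extra_mbp:.1f} Mbp")
print(f"reference vs extra-contig SNP density: {fold_difference:.1f}-fold")
```

The implied length is close to 263.7 − 89.2 = 174.5 Mb, which would fit if the two figures on slide 30 are the assembled and removed contig totals; the slide itself does not say so.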
  • 35. The Ka/Ks ratio estimating the balance between neutral mutations, purifying selection, and beneficial mutations on a set of core and variable genes Distribution of Infinium SNP array markers on chromosome
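How a Ka/Ks ratio is usually read can be sketched as below. Estimating Ka and Ks from aligned coding sequences is the hard part and is not shown; the function name and the tolerance band around 1 are hypothetical choices for illustration, not values from the study.

```python
def selection_regime(ka: float, ks: float, tol: float = 0.05) -> str:
    """Interpret a Ka/Ks ratio.

    ka  -- non-synonymous substitutions per non-synonymous site
    ks  -- synonymous substitutions per synonymous site (must be > 0)
    tol -- half-width of the band around 1 treated as neutral (an assumption)
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if ratio < 1 - tol:
        return "purifying selection"
    if ratio > 1 + tol:
        return "positive selection"
    return "neutral"


for ka, ks in [(0.02, 0.10), (0.10, 0.10), (0.30, 0.10)]:
    print(f"Ka/Ks = {ka / ks:.2f}: {selection_regime(ka, ks)}")
```

Comparing the distribution of these ratios between core and variable genes is what the slide's figure summarizes.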
  • 36. Principal co-ordinate analysis ✔ Three different clusters with one of them having two groups (Caudatum and Kafir ) ✔ Durra and guinea sorghum races displayed identifiable clusters ✔ Caudatum and Kafir accessions exhibited the admixtures
  • 37. Specific and common genes across races ▪ A total of 111 of the variable genes are race-specific ▪ Unique genes from durra are associated with ✔ heat shock protein, LRR repeat protein, L-type lectin-domain receptor, ABC transporter family proteins, and Ras-related proteins ▪ Guinea-group unique genes are associated with ✔ disease resistance protein, beta-glucosidase proteins, NRT1/PTR protein family, etc. GENE CLUSTER ANALYSIS: identified ➢ 11,470 gene families ➢ 6,057 un-clustered genes ▪ 556 from the non-reference genes and the remaining 5,501 from reference genes.
  • 38. Pan-genome helps identify novel genes (1). Genome-Wide Association Analysis (GWAS) with two different mapping populations having phenotypic data for 10 traits. POPULATION 1 ▪ A subset of 227 accessions from the 354 WGS set, belonging to the four major races of sorghum with representation from Africa, Asia, and America, was used ▪ Phenotype and genotype data associated with ➢ Plant height (PH), ➢ Dry biomass (DBM), and ➢ Starch (ST). POPULATION 2 ▪ The stay-green fine-mapping population developed by crossing an introgression line, RSG04008-6 × J2614-11, was used for the association study using the pan-genome assembly ➢ Green leaf area (GLA) ➢ Glossy (GL) leaf ➢ Sheath pigment (LSP) ➢ Plant vigor (V) ➢ Trichome low (TL) ➢ Trichome up (TU) ➢ Shoot fly dead hearts (SFDH)
  • 39. Significant association of SNP’s for plant biomass on chromosome 9
  • 40. Significant association of SNPs for plant height on extra contigs ✔ From Population 1: a total of 36 SNPs on extra contigs were found to be associated with the target traits. Among them: 10 SNPs, 25 SNPs and 1 SNP (figure labels: starch (ST), dry biomass (DBM), plant height (PH)) ✔ From Population 2: green leaf area (GLA) showed significant association with five SNPs on extra contigs
  • 41. Identification of the Drought Candidate Genes ▪ Sorghum RNA-Seq data were generated from drought-resistant [BTx623 (DR1), SC56 (DR2)] and susceptible [Tx7000 (DS1), PI482662 (DS2)] genotypes, with a 6 hr treatment ▪ 79 of the 1,788 drought-responsive (differentially expressed) genes were located on the newly assembled sequence (extra contigs). Genes from novel sequence that were expressed: DR1 data set, 14 (13 up- and 1 down-regulated); DR2 data set, 34 (31 up- and 3 down-regulated)
  • 42. ▪ Overall, five drought-related genes were co-mapped with the trait-associated genes. Among these, two drought-resistance-specific genes, Sobic.005G069800 and Sobic.006G127800, were linked to the plant height and sheath pigment (LSP) traits. Venn diagram: DR1 data set, DR2 data set
  • 43. Functional consequences of new transposable element insertions Possible effects on gene product structure Transposable elements (TEs) , a driver of structural variation
  • 44. The TE insertions were shown to be associated with changes in methylation, chromatin accessibility and potentially regulatory functions Possible effects on gene product abundance
  • 45. TEs as novel regulatory elements TEs carrying ACRs are enriched for association with higher expression of nearby genes, indicates their role as novel regulatory elements (a) Insertions of transposons into genes/regions of accessible chromatin regions (ACR’s) or regulatory elements Might often result in reduced expression of nearby genes or altered patterns of expression (b) Insertion of TE’s that contain ACR’s Might act as mobile enhancers that affect the expression of both the TE promoters and nearby gene promoters ie re-wiring of transcription of nearby genes (Noshay et al., 2020)
  • 46. Pangenome : A tool to unveil the hidden role of Transposable elements(TE’s) in crop evolution
  • 47. Eight high-quality genomes reveal pan-genome architecture and ecotype differentiation of Brassica napus VOL 6 | 2020
  • 48. SNP-based GWAS versus PAV-based GWAS: case study for silique length (SL), seed weight (SW) and flowering time in Brassica napus. Manhattan plots of SNP-GWAS and PAV-GWAS for silique length. GWAS (-lmm 1: Wald test) was performed with 3,971,412 SNPs or 27,216 PAVs in the BN-NAM population containing 2,141 RILs.
  • 49. The peak SNP on chromosome A09 fell within the previously reported region identified by traditional quantitative trait locus mapping and positional cloning. However: • None of the associated SNPs was located in the regulatory region or coding sequence of the target gene BnaA9.CYP78A9 • Encouragingly, PAV-GWAS directly detected the 3.9-kb CACTA-like TE inserted upstream of the BnaA9.CYP78A9 (P450 monooxygenase) promoter region, which was identified as the causal variation for SL and SW. Figure panels: phenotype data of silique length in eight B. napus accessions (experiments were repeated five times with similar results); phenotype data of seed weight in eight assembled B. napus accessions; a 3.6-kb CACTA-like insertion as lead PAV in the BnaA09.CYP78A9 promoter region.
  • 50. Pangenome Revealing secret of niche specific fitness 2 Tapidor Quinta Gagan ZS11 Shengli Zheyou7 Westar No2127 Winter type (WORs) Semi-winter type (SWORs) Spring type (SORs) Eight B. napus accessions Neighbour-joining tree of 210 B. napus accessions, eight assembled accessions and 199 B. rapa accessions
  • 51. Insertions of four transposable elements around BnaA10.FLC in different ecotypes ✔ Validated these TEs in 210 B. napus accessions (141 of which had ecotype information) The role of FLC genes in the divergence of the three rapeseed ecotypes SWORs WORs SORs
  • 52. ✔Due to the LINE insertion in the first exon of BnaA10.FLC, the loss-of-function mutation makes SORs require weak or no vernalization. ✔An 824-bp hAT insertion in the last exon of BnaA02 FLC was identified as the lead PAV by PAV-GWAS in SOR (Spring)Type ✔The MITE insertion in the promoter region of BnaA10.FLC enhances the expression of BnaA10.FLC which leads to a requirement of strong vernalization for WORs. ✔A demand for vernalization of SWOR is somewhere between the other two ecotypes due to the hAT insertion in the promoter region of BnaA10.FLC CONCLUSIVE RESULTS
  • 53. Indicating a strong correlation between specific TE insertions in BnaA10.FLC and ecotype classification Haplotypes of six SNPs and the three TEs located within the 5.0-kb upstream and downstream regions and the coding sequence of BnaA10.FLC Pangenome uncover potentiality of Transposable elements(TE’s) as powerful molecular markers 3
  • 54. CROP | Transposable element | Associated trait
    Maize | A Harbinger-like DNA transposon | Represses the expression of the ZmCCT9 gene to promote flowering under long-day conditions
    Rice | A Gypsy retrotransposon | Enhances the expression of the OsFRDL4 gene and promotes aluminum tolerance
    Orange | Two Copia retrotransposons independently inserted into the promoter region of the orange Ruby gene | Enhanced expression, driving convergent evolution of the blood orange trait
    Maize | Ac/fAc (hAT family element) transposon | Induces expression of the pericarp color 2 gene (p2) by capturing the enhancer sequence of another gene
  • 55. The tomato pangenome un-covers new genes and a rare allele regulating fruit flavor 4 (Gao et al., 2019). Pangenome un-covers rare alleles
  • 56. • Solanum pimpinellifolium (SP) • Solanum cheesmaniae ssp. galapagense (SCG) • Solanum lycopersicum L. var. cerasiforme (SLC) • Solanum lycopersicum L. lycopersicum (SLL). Phylogenetic and principal component analyses (PCA) using the PAVs suggested that wild accessions clearly separated from domesticated accessions with only a few exceptions, and that the two domesticated groups (SLC and SLL) separated but with clear overlaps. Figure panels: violin graph; principal component analyses (PCA)
  • 57. “Who will last in the Run?” Scatter plots Gene selection preference during tomato domestication and improvement
  • 58.
  • 59. A rare promoter allele that modifies fruit flavor. Pan-genome analysis found a ~4-kb substitution in the promoter region of TomLoxC (Solyc01g006540), which encodes a 13-lipoxygenase essential for C5 and C6 green-leaf volatile production in tomato fruit. The 4,151-bp non-reference allele of the TomLoxC promoter was captured in the pan-genome; it is a rare allele in cultivated tomatoes, which reflects strong negative selection during domestication. • TomatoPan028690, a truncated part of the fruit weight gene Cell Size Regulator (CSR), was detected in all SP, 88.6% of SLC and 14.4% of SLL heirlooms. ✔ This supports the view that the deletion allele arose during domestication and has been largely fixed in cultivated tomatoes. Human selection influenced fruit quality or additional phenotypes in some instances by targeting regulatory sequences.
  • 60. S. pimpinellifolium SP (47.4%) Modern SLL cultivars (7.2%) All heterozygotes S. cheesmaniae SLC (8.4%) SLL heirlooms (1.1%), Most likely because of recent introgressions from wild into cultivated tomatoes. consistent with its selection during modern breeding, possibly the consequence of selecting lines with superior stress tolerance in agricultural settings The frequency of the non-reference allele ✔ Expression levels of TomLoxC in orange-stage fruit of accessions with different promoter alleles ✔ Heterozygous TomLoxC promoter genotypes have the strongest expression in orange-stage fruit.
  • 61. 5 Pangenome helps trace back to domestication trajectory
  • 62. Nature | Vol 588 | 10 December 2020 • Constructed chromosome-scale sequence assemblies for 20 accessions • Paired-end and mate-pair Illumina short reads were assembled into scaffolds • Chromium linked-reads and chromosome conformation capture (Hi-C) data were used to arrange scaffolds into chromosomal pseudomolecules using the TRITEX pipeline • A single-copy pan-genome was used for genetic analysis in a wider diversity panel: single-copy regions were extracted from each of the 20 assemblies and clustered into a non-redundant set of sequences
  • 63. Translate single-copy sequences variation into scorable markers which are amenable to population genetic analysis and association scans Genome-wide association scan for lemma adherence on the basis of PAV markers Lemma adherence covered - NUDUM (NUD) gene INFERENCE All varieties of naked barley are thought to trace back to a single mutational event, deleting the entire NUD sequence How much significant a single-copy pangenome is ?
  • 64. Mapping of polymorphic inversions in population (6). Objective: to discover inversions in a broader set of germplasm • Hi-C-based inversion scans on Hi-C data of a diversity panel mapped to a single reference genome ✔ Among 69 barley genotypes (67 domesticated and 2 wild accessions), the scans revealed a total of 42 events that ranged from 4 to 141 Mb in size (mean size of 23.9 Mb) ✔ A notable finding was the prevalence of large (more than 5 Mb in size) inversion polymorphisms in current elite germplasm
  • 65. Identification and characterization of a large inversion on chromosome 7H. Mapping populations: 1. RGT Planet (inversion carrier) × Hindmarsh (R × H) 2. Morex × Barke (M × B) ▪ The earliest cultivar that carried the inversion was Diamant, one of the donors of the semi-dwarf growth habit
  • 66. ▪ Diamant: a highly influential founder line of modern barley breeding, which traces back to a mutant induced by gamma irradiation of the Czech cultivar Valticky ▪ Gene bank/germplasm study: none of the Valticky samples carried the inversion, whereas it segregated in the Diamant samples. INFERENCE: this strongly suggests that mutation breeding in the 1960s has given rise to a cryptic large inversion, which, unbeknownst to breeders, segregates in elite varieties of barley • A map of inversion polymorphisms will provide breeders with a point of reference to avoid, or interpret correctly, crosses between carriers and non-carriers.
  • 67. Expanding Gene-Editing Potential in Crop Improvement with Pangenomes (7): identify non-recombinant inversions in the pangenome, with high-precision identification of chromosomal rearrangement boundaries; the CRISPR protein complex induces re-inversion; genes locked in the region become accessible to recombination in the population. Reversal of an inversion through CRISPR technology allows crossing of genes in inverted regions (Fernandez, 2022)
  • 68. ✔ CRISPR-Cas can be used to study the effect of gene dosage by generating a series of allelic mutants through knock-out/knock-down mutation of specific variant alleles. E.g.: pleiotropic effects of the mlo gene (barley) against powdery mildew ✔ Potential benefits of using a pangenome reference for genetic modification: 1. Genetic diversity analysis helps to identify potential target sites for genome editing 2. Identify CNVs that influence CRISPR-Cas mutation effectiveness 3. Identify novel target alleles and map their position on the pan-genome 4. Avoid off-target effects in multiplex editing by designing specific sgRNAs, thus supporting accurate and specific guide RNA design
  • 69. Crop pangenome | Findings | Reference
    Maize pangenome (66 inbred lines) | ✔ Identified inversions; the largest inversion spans 75.5 Mb in the pericentric region of chromosome 2 | Schwartz C. et al., 2020, Nature Plants
    Cotton pangenome (890 accessions) | Meta-GWAS and gene expression analysis; gene knockout with CRISPR-Cas9 ✔ Identified the previously uncharacterized gene GhIDD7, subsequently shown to control fibre length | Li et al., 2021, Genome Biol.
    Rice pangenome (66 accessions), “Green revolution phenotype” | ✔ Identified 129 conserved gene loci ✔ CRISPR-Cas knock-out/knock-down study uncovered 31 high-yield-related genes, including six previously reported genes such as the sd1 semi-dwarf gene | Huang J. et al., 2018
  • 70. Role of Cis-regulatory elements (CREs) and their Pan- genome identification for fine tuning of gene expression ❑ The CREs are noncoding DNA sequences capable of recruiting transcription factors and affecting gene expression ❑ The CREs can be broadly subdivided into promoters and enhancers or silencers (Zanini et al., 2021) 8
  • 71. Genome editing of cis-regulatory elements: a hypothetical scenario of editing of Brassica napus CLV3 homologues’ cis-regulatory elements to generate multiocular siliques and range of variation in seed number. Brassica napus has two, mostly redundant, copies of BnCLV3, so editing of both would be necessary (Xu et al., 2021; Yang et al., 2018)
  • 72. Importance of pan-genomics as approach to explaining heterosis ❖ Pan-genomics can play an important role in unraveling gene members and families contributing to heterosis, according to the proposed model ❖ A new gene and variant finding is essential to explaining and utilizing heterosis for crop improvement. A model of heterosis proposed by Swanson-Wagner et al., 10
  • 73. Pan-genome: a resource to explore the breeding potential of under-utilised crop species (11). Integrating rich phenotype data. Guava: fruit and leaf metabolites and fruit aroma volatiles of 27 guava accessions were investigated; these datasets could be used to scan a guava pangenome for fruit-related traits. A super-pangenome of yam bean species (P. erosus, P. ahipa and P. tuberosus) helps to infer the effects of SVs on phenotype, including traits directly related to plant performance such as ✔ days to flowering and maturity ✔ plant height ✔ root biomass. By developing resources for under-utilised crops, novel genes related to agro-morphological traits can be detected and used to inform breeding programs or used for introgression.
  • 74.
  • 75. CURRENT STATUS AND FUTURE ASPECTS OF PANGENOMIC STUDIES
  • 76. A summary of plant pangenome studies.
  • 77. Species (reference) | Single reference size | Pangenome size | Traits studied using the pangenome | Variant type | Reference
    Brassica oleracea, B. macrocarpa (cultivated and wild cabbage) (Bo TO1000) | 488 Mb; 59,225 genes | 587 Mb; 61,379 genes | Disease resistance, flowering time, secondary metabolites | PAV | Golicz et al., 2016
    Cajanus cajan (pigeon pea) (Asha) | 606 Mb; 53,612 genes | 622 Mb; 55,512 genes | Self-fertilization, disease resistance, seed weight | SNP, PAV | Zhao et al., 2020
    Glycine soja (wild soybean) (GsojaD, Shandong) | 985 Mb; 57,631 genes | 986.3 Mb; 59,080 gene families | Disease resistance, flowering time, oil content, height and lodging, yield | CNV, PAV, SNP, InDel | Li et al., 2014
    Gossypium hirsutum (upland cotton) (TM-1) | 2,347 Mb; 70,199 genes | 3,388 Mb; 102,768 genes | Flowering time, morphology, yield, fiber traits | PAV, SNP | Li et al., 2021
    Oryza sativa (Nipponbare) | 384 Mb | Indica: 52,976 | Disease, stress resistance, grain width and size | SNP | Yao et al.,
    Zea mays (maize) (B73) | 2,182 | 103,538 genes | Disease resistance, flowering time | PAV, TE insertion | Hufford et al., 2021
  • 78. Rice Pan-genome Array (RPGA): an efficient genotyping solution for pan-genome-based accelerated crop improvement in rice. Anurag Daware, Ankit Malik, Rishi Srivastava, Durdam Das, Ranjith K. Ellur, Ashok K. Singh, Akhilesh K. Tyagi and Swarup K. Parida (2022) ✔ The “Rice Pan-genome Genotyping Array (RPGA)” is the first pan-genome-based SNP genotyping assay developed for crop plants ✔ It efficiently captures haplotype variation from the entire 3K rice pan-genome, representing diverse populations (Indica, Tropical/Temperate japonica, aus, Aromatic, etc.) ✔ RPGA assays a total of 80,504 SNPs, including 60,026 SNPs from the 12 Nipponbare chromosomes and 20,478 SNPs from the 12 pseudo-chromosomes of the 3K rice pan-genome.
  • 79. ‘RICE PAN-GENOME GENOTYPING ARRAY’ ANALYSIS PORTAL(RAP) http://www.rpgaweb.com 3K Rice Reference Panel and subsequent GWAS
  • 80. “Super-pangenome is the approach of developing a pangenome of the pangenomes of different species for a given genus”. Super-pangenome: A way forward
  • 81. Khan,W.et al.,2020 Approaches for the construction of super pangenome
  • 82. ▪ Super-pangenomes support the breeding of crops better adapted to diverse environments and more resilient to climate change by analyzing gene frequency change during domestication/evolution. A super-pangenome study involving polyploid Brassica napus and its two diploid progenitor genomes gives: • Comparative modelling of the gene loss propensity in diploid and polyploid Brassica sp. ▪ Diploids: primarily associated with transposable elements ▪ Polyploid B. napus: associated with homoeologous recombination ▪ Identification of beneficial haplotypes that could be introgressed through conventional breeding (Bayer et al., 2021, Plant Biotechnology Journal)
  • 83. ▪ The tomato super-pangenome identified functional polymorphisms in genes associated with fruit flavour (LIN5, ALMT9, AAT1, CXE1, and LoxC). ▪ The cotton super-pangenome gives knowledge of genomic diversity among five polyploids and their monophyletic origin ▪ Polyploid genomes are conserved in gene content and synteny ▪ They are diversified by sub-genomic transposon exchanges that equilibrate genome size, evolutionary rate, and positive selection between homoeologs within and among lineages ▪ The super-pangenome of banana identified gene differences between the Musa and Ensete genera, as well as 12,310 new gene models in the species, forming distinct PAV clusters between the Ensete and Musa accessions (Chen et al., 2020, Nature Genetics)
  • 84. APPLICATIONS OF PANGENOME 1. Finding novel genes 2. Revealing niche specific fitness 3. Evolution, Domestication and Breeding History 4. Helps to identify potential target site for genome editing 5. Facilitating taxonomic identification 6. Approach to explaining heterosis 7. Elucidating host pathogen interaction 8. Strengthening proteogenomic
  • 86. Beyond pan-genome ? Pan-Transcriptome Potent bioinformatic tools Pan-Metabolome Pan- Epigenomes
  • 88.  Comparative modelling of the propensity for gene loss in the three species revealed that in the diploids, genes with propensity for loss are primarily associated with transposable elements, while in the polyploid B. napus, propensity for gene loss was associated with the position of the gene on the pseudomolecule  Studying how genes change in frequency between domesticated crops and their wild relatives using
  • 89. Materials and methods. Rapeseed (Brassica napus) reference genomes: two winter-type oilseed rapes (WORs; Darmor-bzh and Tapidor) and two semi-winter oilseed rapes (SWORs; ZS11 and NY7). Genome-wide comparative analysis of eight well-assembled genomes and the Darmor-bzh genome identified the core gene clusters, dispensable gene clusters and specific gene clusters. Assemblies were created as follows: ✔ ZS11 was assembled de novo using PacBio, Hi-C and BioNano data ✔ The other seven accessions were assembled by integrating high-coverage PacBio and Illumina data; two of them were verified by Hi-C or BioNano data. Multiple high-quality reference genomes representing different ecotypes are necessary for a better understanding of genome structure and the genetic basis of morphotype
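The partition into core, dispensable and specific gene clusters described on this slide can be illustrated with a minimal sketch. It assumes the common definitions (core = present in every accession, specific = present in exactly one, dispensable = anything in between); the function name, cluster IDs and accession sets below are hypothetical toy values, not data or code from the study.

```python
from typing import Dict, Set


def classify_clusters(presence: Dict[str, Set[str]], n_accessions: int) -> Dict[str, str]:
    """Label each gene cluster by how many accessions carry it.

    presence maps a cluster ID to the set of accessions in which at least
    one member gene of that cluster is present (hypothetical input format).
    """
    labels = {}
    for cluster, accessions in presence.items():
        count = len(accessions)
        if count == 0:
            raise ValueError(f"cluster {cluster} is present in no accession")
        if count == n_accessions:
            labels[cluster] = "core"          # shared by all accessions
        elif count == 1:
            labels[cluster] = "specific"      # private to a single accession
        else:
            labels[cluster] = "dispensable"   # shared by some, not all
    return labels


# Toy example with three accessions (illustrative only).
toy_presence = {
    "cluster_1": {"acc_A", "acc_B", "acc_C"},
    "cluster_2": {"acc_A", "acc_C"},
    "cluster_3": {"acc_B"},
}
toy_labels = classify_clusters(toy_presence, n_accessions=3)
```

In practice the presence sets would come from an orthology or gene-family clustering step run across all assemblies; that step is outside this sketch.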
  • 90. GWAS of flowering time in the nested association mapping (NAM) population. Manhattan plots for flowering time analysed by SNP-GWAS in winter and spring environments, respectively. Manhattan plots for flowering time analysed by PAV-GWAS in winter and spring environments, respectively. GWAS (-lmm 1: Wald test) was performed with 3,971,412 SNPs or 27,216 PAVs in the BN-NAM population containing 2,141 RILs.
  • 91. The role of FLC genes in the divergence of the three rapeseed ecotypes. Insertions of four transposable elements around BnaA10.FLC in different ecotypes: ✔ Four TEs were identified in the promoter and coding region of BnaA10.FLC ✔ These TEs were validated in 210 B. napus accessions (141 of which had ecotype information). RESULTS ✔ All the WORs contained the MITE insertion ✔ 85% (22/26) of the SORs contained the LINE insertion ✔ 81% (80/99) of the SWORs contained the hAT insertion. Panels: SWORs, WORs, SORs
  • 92. Flowering time of lines with different BnaA02.FLC alleles in spring and winter environments, respectively. Panels: Spring, Winter
  • 93. An 824-bp hAT insertion in the last exon of BnaA02.FLC was identified as the lead PAV by PAV-GWAS. SOR (spring type)
  • 94. CONCLUSIVE RESULTS ✔ The LINE insertion in the first exon of BnaA10.FLC causes a loss-of-function mutation, so SORs require weak or no vernalization. ✔ The MITE insertion in the promoter region of BnaA10.FLC enhances its expression, so WORs require strong vernalization. ✔ The vernalization requirement of SWORs lies between those of the other two ecotypes because of the hAT insertion in the promoter region of BnaA10.FLC. ✔ An 824-bp hAT insertion in the last exon of BnaA02.FLC was identified as the lead PAV by PAV-GWAS. SOR (spring type)
  • 95. 824-bp hAT insertion in the last exon of BnaA02.FLC. BnaA02.FLC has a stronger flowering repression effect than BnaC02.FLC
  • 96. BnaA02.FLC has a stronger flowering repression effect than BnaC02.FLC Both possess BnaA10.FLC Tapidor Quinta Winter type (WORs) Tapidor Two copies of BnaA02.FLC One copy of BnaC02.FLC Quinta This may cause the difference in flowering time between them One copy of BnaA02.FLC Shengli Zheyou7 Tapidor BnaC02.FLC gene is replaced by BnaA02.FLC BnaA02.FLC was not expressed at any stage + one functional BnaC02.FLC No2127 Gangan Westar Gene BnaC02.FLC is completely absent
  • 97. ✔ Cumulative expression levels of three FLC genes and flowering time of the eight assembled B. napus accessions, in relation to PAVs and copy number. The stacked histogram shows FLC expression at T0–T3: 24 (T0), 54 (T1), 82 (T2) and 115 (T3) days after sowing. Panel labels: spring type (SOR), semi-winter type, spring type (SOR)
  • 98.
  • 99. • Among the unfavorable genes, seven were not full length. • TomatoPan028690, a truncated part of a fruit weight gene, Cell Size Regulator (CSR), was detected in all SP, 88.6% of SLC and 14.4% of SLL heirlooms. ✔ This supports the view that the deletion allele arose during domestication and has been largely fixed in cultivated tomatoes. Selection of promoter PAVs during tomato breeding: of a total of 90,929 nonreference contigs, 3,741 nonreference sequences localized in putative promoter regions, and 980 promoter sequences were under selection. The expression of their downstream genes was checked (RNA-Seq data for orange-stage fruit) in the 397 accessions, together with the PAV patterns of these promoters as well as those in the reference genome
  • 100. RESULT: Of these promoters, 240 had downstream genes with significantly different expression. In some instances, human selection influenced fruit quality or additional phenotypes by targeting regulatory sequences. A rare promoter allele that modifies fruit flavor: pan-genome analysis found a ~4-kb substitution in the promoter region of TomLoxC (Solyc01g006540), which encodes a 13-lipoxygenase essential for C5 and C6 green-leaf volatile production in tomato fruit. The 4,151-bp nonreference allele of the TomLoxC promoter was captured in the pan-genome; it is a rare allele in cultivated tomatoes, which reflects strong negative selection during domestication.
  • 101. Confirmation of the involvement of TomLoxC in apocarotenoid biosynthesis: ✔ QTL mapping identified TomLoxC as the cause of changed levels of flavor-associated lipid- and carotenoid-derived volatiles. ✔ Analysis of transgenic tomato fruit in which TomLoxC expression was repressed revealed a previously unknown alternative apocarotenoid production route. ❑ The tomato pan-genome harbors useful genetic variation that was not visible in the ‘Heinz 1706’ reference genome alone. ❑ The tomato pan-genome revealed extensive domestication- and improvement-associated loci and genes, with an evident bias toward those involved in defense response
  • 102. On average, each of the 20 genotypes contained 2.9 Mb of single-copy sequence not present in any other assembly. Procedure: to test the suitability of the single-copy pan-genome for genetic analysis, the abundance of 160,716 single-copy clusters that overlap structural variants was estimated by counting cluster-constituent k-mers (k = 31) in sequence reads of the diversity panel
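The k-mer counting step on this slide can be sketched in plain Python. This is an illustration of the idea only: the study does not state here which tool was used, and a real analysis on sequencing data would rely on a dedicated k-mer counter. The function names, the use of canonical (strand-independent) k-mers and the toy sequences are assumptions, and the toy example uses k = 5 instead of k = 31 to keep the sequences short.

```python
from collections import Counter
from typing import Dict, Iterable, List

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq: str, k: int) -> List[str]:
    """Return all overlapping k-mers of seq (empty if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def canonical(kmer: str) -> str:
    """Strand-independent form: the smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def cluster_abundance(clusters: Dict[str, str], reads: Iterable[str], k: int = 31) -> Dict[str, int]:
    """Count, per cluster, how many read k-mers match one of its constituent k-mers."""
    # Index every canonical k-mer of every cluster sequence.
    index: Dict[str, set] = {}
    for cluster_id, seq in clusters.items():
        for km in {canonical(x) for x in kmers(seq, k)}:
            index.setdefault(km, set()).add(cluster_id)

    counts = Counter({cluster_id: 0 for cluster_id in clusters})
    for read in reads:
        for km in kmers(read, k):
            for cluster_id in index.get(canonical(km), ()):
                counts[cluster_id] += 1
    return dict(counts)


# Toy data (hypothetical sequences); the second read is the reverse
# complement of cluster c2, so it still matches through canonical k-mers.
toy_clusters = {"c1": "AAAAACCCCC", "c2": "GATTACAGAT"}
toy_reads = ["AAAAACCCCC", "ATCTGTAATC", "ACG"]
toy_counts = cluster_abundance(toy_clusters, toy_reads, k=5)
```

The per-cluster counts would normally be normalized by sequencing depth before accessions are compared; that normalization is not shown.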
  • 103. • Local PCA and haplotype analysis in a panel of 200 domesticated and 100 wild barley varieties indicated a single origin of the inverted haplotype. • The inversion occurred only among domesticated barley of Western geographical origin, indicating that it arose, or rose to high frequency, after domestication. The inverted region contains high-confidence genes in the Morex cultivar. The gene closest to an inversion breakpoint (448 kb from the distal breakpoint in the non-carrier Morex) was HvCENTRORADIALIS (HvCEN). • Although induced mutants of HvCEN flower very early, natural variation in HvCEN has previously been implicated in environmental adaptation to northern European climates. • All of the inversion carriers analysed had HvCEN haplotype III, which is associated with later flowering in spring barley varieties from northern Europe
  • 104. Neighbor-joining tree of 271 diverse rice accessions belonging to three cultivated and wild rice species, viz. O. sativa, O. nivara and O. rufipogon. RPGA-based SNP genotyping was used to efficiently decode the natural allelic diversity and population genetic structure, in order to understand the domestication pattern in the rice gene pool.
  • 105. RESULTS ✔ Indian traditional Basmati accessions clustered distinctly from aromatic rice accessions belonging to both north-eastern India and other parts of the world. ✔ Traditional Basmati displayed a closer genetic relationship with japonica and aus accessions. Evolved Basmati: traditional Indian Basmati × indica variety (IND 1). ✔ Two sub-groups were identified within the indica subpopulation: INDI, corresponding to Xian/Indica-2 (XI-2) from South Asia, and INDII, corresponding to XI-3 from Southeast Asia, both previously reported along with two other indica subpopulation groups (XI-1A from East Asia and XI-1B of modern varieties of diverse origins)
  • 106. High-resolution QTL mapping was conducted using the RPGA-based ultra-high-density 535 genetic linkage map (Sonasal × PB 1121 RILs). The RPGA-based GWAS detected many previously known major grain size/weight genes, such as GS3 and PGL1 (grain length and length-to-width ratio) and GW5 (grain width), which validates the ability of pan-genome-based GWAS to detect true associations. The WDR12 gene (a candidate gene regulating grain length) underlying the qLWR7 QTL, validated by both RPGA-based GWAS and QTL mapping, is known to and thus appears to be a promising