User-friendly tools for the Arabidopsis thaliana 1001 Genomes
1. User-friendly web tools for the Arabidopsis thaliana 1001 Genomes
Beth Rowan
Max Planck Institute for Developmental Biology
Plant and Animal Genomes XXIV
January 11, 2016
2. Why sequence 1001 Arabidopsis thaliana genomes?
Brief history of variant discovery
8. Why sequence 1001 Arabidopsis thaliana genomes?
Goals
-understand genome variation in the species
-reconstruct demographic history
-identify geographic and genetic subsets
-generate a powerful resource for genome-wide association studies
Brief history of variant discovery
[Figure: number of SNPs (log scale, axis ticks 10^3, 10^5, 10^7) by year, 1995 to 2015. Labeled milestones, in order: first reference genome; haplotype map with 20 strains; 2 wild strains resequenced; >80 wild strains resequenced; 1135 wild strains resequenced]
32. Integrating tools with Araport
JBrowse
Left click on variant to see annotation and accession information
33. Integrating tools with Araport
Future plans
1. Get all SNPs in region
2. Get all indels in region
3. Get SnpEff info for given SNP
4. Get VCF subset for given region
5. Get pseudogenomes
6. Helper function: Translate gene id to coordinates
7. Get allele frequencies for variants
8. Identify allele/haplotype groups
9. Find ADMIXTURE cluster membership
10. Experimental design tool for subsetting 1001 collection
examples:
-subset with greatest genetic diversity
-accessions with similar climates but from different geographical areas
-accessions with different population histories
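As a rough illustration of what plans 1, 2, 6, and 7 could look like to a caller, the sketch below runs region queries over a few in-memory variant records. Nothing here is the Araport or 1001 Genomes API: the `Variant` record, every function name, the gene coordinates, and the genotype data are invented for the example. Genotypes are treated as haploid calls (0 = reference, 1 = alternate), a simplification that assumes largely homozygous accessions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int                              # 1-based position
    ref: str
    alt: str
    genotypes: Tuple[Optional[int], ...]  # per accession: 0 = ref, 1 = alt, None = missing

    @property
    def is_snp(self) -> bool:
        return len(self.ref) == 1 and len(self.alt) == 1

    @property
    def is_indel(self) -> bool:
        return len(self.ref) != len(self.alt)


# Hypothetical gene table; the ID and coordinates are invented, not real annotation.
GENE_COORDS = {"GENE_A": ("Chr1", 100, 200)}

# Toy variant records standing in for a real VCF.
VARIANTS = [
    Variant("Chr1", 120, "A", "G", (0, 1, 1, None)),
    Variant("Chr1", 150, "AT", "A", (0, 0, 1, 0)),
    Variant("Chr1", 250, "C", "T", (1, 1, 1, 1)),
    Variant("Chr2", 130, "G", "C", (0, 1, 0, 0)),
]


def gene_to_region(gene_id):
    """Plan 6: translate a gene ID into (chromosome, start, end)."""
    return GENE_COORDS[gene_id]


def variants_in_region(variants, chrom, start, end):
    """Return variants on `chrom` whose position lies in [start, end]."""
    return [v for v in variants if v.chrom == chrom and start <= v.pos <= end]


def snps_in_region(variants, chrom, start, end):
    """Plan 1: SNPs only."""
    return [v for v in variants_in_region(variants, chrom, start, end) if v.is_snp]


def indels_in_region(variants, chrom, start, end):
    """Plan 2: indels only."""
    return [v for v in variants_in_region(variants, chrom, start, end) if v.is_indel]


def alt_allele_frequency(variant):
    """Plan 7: alternate-allele frequency among accessions with a call."""
    called = [g for g in variant.genotypes if g is not None]
    if not called:
        return None
    return sum(called) / len(called)


if __name__ == "__main__":
    chrom, start, end = gene_to_region("GENE_A")
    for v in snps_in_region(VARIANTS, chrom, start, end):
        print("SNP", v.chrom, v.pos, v.ref, v.alt, alt_allele_frequency(v))
    for v in indels_in_region(VARIANTS, chrom, start, end):
        print("indel", v.chrom, v.pos, v.ref, v.alt, alt_allele_frequency(v))
```

Missing calls are excluded from the denominator, so a frequency reflects only accessions that were actually genotyped at that site; a real service might report the call rate alongside it.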
34. Integrating tools with Araport
Future plans
https://www.surveymonkey.com/r/8DTCVQF
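One of the examples given for the planned experimental design tool is picking the subset of accessions with the greatest genetic diversity. A minimal sketch of one way to approach that is greedy farthest-point selection on Hamming distance between genotype vectors. This is an assumed technique chosen for illustration, not the method planned for the tool; the accession names, genotypes, and function names are all invented.

```python
def hamming(a, b):
    """Number of sites at which two genotype vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def diverse_subset(genotypes, k):
    """Greedily pick up to k accessions that are far apart from each other.

    Starts from the accession with the largest total distance to all others,
    then repeatedly adds the accession whose nearest already-chosen neighbour
    is farthest away. Ties are broken by accession name.
    """
    names = sorted(genotypes)
    if k <= 0 or not names:
        return []
    first = max(
        names,
        key=lambda n: sum(hamming(genotypes[n], genotypes[m]) for m in names),
    )
    chosen = [first]
    while len(chosen) < min(k, len(names)):
        remaining = [n for n in names if n not in chosen]
        best = max(
            remaining,
            key=lambda n: min(hamming(genotypes[n], genotypes[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Toy genotype vectors (0 = reference allele, 1 = alternate allele).
GENOTYPES = {
    "acc1": (0, 0, 0, 0, 0, 0),
    "acc2": (0, 0, 0, 0, 0, 1),
    "acc3": (1, 1, 1, 1, 1, 1),
    "acc4": (1, 1, 1, 0, 0, 0),
}

if __name__ == "__main__":
    print(diverse_subset(GENOTYPES, 3))  # ['acc3', 'acc1', 'acc4']
```

The greedy approach is a heuristic: it does not guarantee the most diverse subset overall, but it avoids picking near-identical accessions such as `acc1` and `acc2` together.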
35. Acknowledgements
Joffrey Fitz
1001 Genomes Consortium
Web Tools
Project coordinators
Detlef Weigel, MPI for Developmental Biology
Magnus Nordborg, Gregor Mendel Institute
Joy Bergelson, University of Chicago
Joe R. Ecker, Salk Institute
Mitchell Sudkamp, Monsanto
Database creation
Congmao Wang, Zhejiang Acad. of Agri. Sciences
Alexander Platzer, Gregor Mendel Institute
+All Consortium Contributors
Ümit Seren