2. Overview
Databases and Websites – the infrastructure to deal with the data deluge and make the resources useful
Hardwood Genomics Content
Hardwood Genomics Tools
Tripal – the database software
4. Software tools to find and explore information
Pages with information to explore and download: Organism Page, mRNA Page, Genome Page
BLAST Search, Keyword Search, Symap Search
8. Genes -> mRNA transcripts
• Genes – the functional units of the genome
• Can produce more than one type of transcript
• (Isn’t biology cool!?!)
• The Chinese chestnut genome has about
• 36,478 genes
• 38,146 transcripts
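The two counts above can be turned into a quick back-of-the-envelope figure. This is a minimal sketch using only the numbers on the slide; the variable names are illustrative.

```python
# Counts quoted on the slide for the Chinese chestnut genome.
genes = 36_478
transcripts = 38_146

# Average number of transcripts per gene, and how many transcripts
# exist beyond a simple "one transcript per gene" picture.
transcripts_per_gene = transcripts / genes
extra_transcripts = transcripts - genes

print(f"{transcripts_per_gene:.3f} transcripts per gene on average")
print(f"{extra_transcripts} more transcripts than genes")
```

So most genes have a single transcript in this annotation, and a minority account for the additional ones.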
12. Chestnut Genome
• Chestnut has an estimated 800,000,000 bases of DNA
• We have sequenced the chestnut genome and placed it into 41,260 pieces covering 724 Mb
• Equivalent to a 500,000 page book
• Not recommended for light reading!
• Nathaniel Cannon’s work is putting these pieces in order
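The numbers on this slide imply a few useful summary figures. The sketch below is only arithmetic on the quoted values; it assumes 1 Mb means 1,000,000 bases and that the book analogy refers to the estimated (not the assembled) genome size.

```python
# Figures quoted on the slide.
estimated_genome_bases = 800_000_000
assembled_bases = 724_000_000   # 724 Mb, assuming 1 Mb = 1,000,000 bases
pieces = 41_260
book_pages = 500_000

# Fraction of the estimated genome covered by the assembled pieces.
coverage = assembled_bases / estimated_genome_bases

# Average length of one assembled piece, in bases.
mean_piece_length = assembled_bases / pieces

# Bases per page in the "500,000 page book" analogy
# (assumption: the analogy is based on the estimated genome size).
bases_per_page = estimated_genome_bases / book_pages

print(f"coverage: {coverage:.1%}")
print(f"mean piece length: {mean_piece_length:,.0f} bases")
print(f"bases per page: {bases_per_page:,.0f}")
```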
13. Ways to access the Chestnut genome - Download
• Download the raw sequence files: 41,260 of these
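Once a sequence file is downloaded, a first sanity check is to count the pieces and bases it contains. The sketch below assumes the raw sequences are in FASTA format (the slide does not name the format); `summarize_fasta` and the tiny in-memory example are hypothetical, not part of the site's tooling.

```python
import io


def summarize_fasta(handle):
    """Return (number_of_sequences, total_bases) for a FASTA text stream."""
    n_sequences = 0
    total_bases = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            n_sequences += 1  # a header line starts a new sequence
        else:
            total_bases += len(line)  # sequence lines contribute bases
    return n_sequences, total_bases


# Tiny made-up example standing in for a downloaded file.
example = io.StringIO(">piece_1\nACGTAC\nGT\n>piece_2\nTTGA\n")
count, bases = summarize_fasta(example)
print(count, bases)
```

For a real download, the same function can be called on an open file handle, for example `with open(path) as fh: summarize_fasta(fh)`, where `path` is wherever the file was saved.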
18. Ways to access the Chestnut genome
• This 500,000 page file isn’t too useful…
• What does it do? Where are the genes???
I could really use a map!!!
29. Other ways to find what you need and cool stuff to do
• Searching
• Search by gene name
• Search by function
• “Cytochrome P450”
• “Mitogen-activated protein kinase”
• Gene annotation
• Think we did a bad job for the “map” of this gene? You can go fix it!
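Searching by function amounts to matching a keyword against each gene's functional description. The sketch below illustrates the idea only; it is not how the website implements search, and the gene IDs and descriptions are made up for the example.

```python
def search_by_function(annotations, keyword):
    """Return sorted gene IDs whose description contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return sorted(
        gene_id
        for gene_id, description in annotations.items()
        if keyword in description.lower()
    )


# Hypothetical gene IDs and descriptions, for illustration only.
annotations = {
    "gene_0001": "Cytochrome P450 family protein",
    "gene_0002": "Mitogen-activated protein kinase 3",
    "gene_0003": "Hypothetical protein",
}

hits = search_by_function(annotations, "cytochrome p450")
print(hits)
```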
30. Other ways to find what you need and cool stuff to do
Symap
• Compare the structure of the chestnut genome to other trees
32. A web framework for genetic and genomic data
Goals:
Simplify construction of community genomics websites
Enable individual labs or research communities
Encourage high-quality, standards-based websites for data sharing and collaboration
Expand and reuse code
35. Why use Tripal?
• Open source
• Friendly developers
• Responsive mailing list
• Much of the stuff you need for a website is already there
Modules:
• Organisms
• Genomes
• Transcriptomes
• Stocks/Germplasm
• Phenotypes
• Genotypes
36. NSF DIBBS Grant
• By leveraging the needs across many plant genomic communities, we can make a strong case for federal support
• Funded development helps everyone!
• DIBBS: Integrate Tripal with Galaxy, an open source, web-based platform for data intensive biomedical research.
Stephen Ficklin
37.
Jack Davitt, Former Research Associate
Nathan Henry, Research Associate
Ming Chen, Graduate Student
Editor's Notes
Developing High-Throughput Bioinformatic Pipelines for Cross-Disciplinary Research at an Ag University: Case Studies in Dog, Chicken, and Hardwood Trees
Overview
Perspectives from a core facility
Examples
GWAS
Whole genome resequencing
RNA sequencing
Training
Research
Chestnut blight resistance
Comparative genomics of hardwood trees