Proteome bioinformatics and genetics for associating proteins with grain phenotype

International Gluten Workshop, 11th; Beijing (China); 12-15 Aug 2012

Usage Rights: CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • Proteome bioinformatics and genetics for associating proteins with grain phenotype
    Rudi Appels, Centre for Comparative Genomics, Murdoch University, Australia
    Paula Moolhuijzen, Brett Chapman, Wujun Ma, Dean Diepeveen, Matthew Bellgard, Centre for Comparative Genomics, Murdoch University and Department of Food and Agriculture WA, Australia
    Yueming Yan, Shunli Wang, Capital Normal University, Beijing
    Angela Juhasz, Agricultural Institute, Martonvásár, Hungary
    Frank Bekes, FBFD Pty Ltd, Beecroft, Sydney, Australia 2119
  • Centre for Comparative Genomics (CCG) at Murdoch University
    Supercomputer:
    • Stage 1A Pawsey Centre (SKA)
    • Ranked 87 in the world
    • 9600 cores
  • Outline:
    • Genome sequencing and high resolution genetic maps of wheat
    • Integrating new wheat protein level analyses
    • Translating research findings to industry: the Decision Matrix
  • The integration of new efforts to obtain reference sequences for bread wheat and barley genomes is accelerating gene discovery.
    • Locations of traits and proteins on DNA sequence assemblies via genetic maps define gene networks.
    • The genomic resources are refining molecular marker development and mapping strategies for combining yield with quality attributes of the grain that meet market requirements.
  • Locations of proteins within a genetic map can be determined.
    One of the first examples was published by Amiour et al (2003), using 2D gels to identify chromosomal locations of amphiphilic proteins from wheat grains. Later, Chen et al (2007) carried out mapping using MALDI-TOF defined peaks of gliadin.
    Progress in the DNA sequencing of transcribed wheat genes now allows higher resolution maps to be established.
    Amiour N, et al (2003) Theor. Appl. Genet. 108: 62-72.
    Chen J, et al (2007) Rapid Comm Mass Spectrometry 21: 2913-2917.
  • 2007-2012: Suites of genomic resources and knowledge have been established to provide the foundation for sequencing the wheat and barley genomes:
    • International Wheat Genome Sequencing Consortium (www.wheatgenome.org)
    • UK WISP consortium (www.wheatisp.org)
    • International Barley Sequencing Consortium (www.barleygenome.org)
    • European TriticeaeGenome FP7 project (www.triticeaegenome.eu)
    The initiatives built on long standing resources such as:
    • KOMUGI in Japan (www.shigen.nig.ac.jp/wheat/komugi/)
    • Graingenes in the USA (wheat.pw.usda.gov/GG2/index.shtml)
    • Extensive EST collections (ITEC http://avena.pw.usda.gov/genome/)
  • Reducing the complexity of the wheat genome through flow sorting of chromosome arms has formed the basis for the international effort to produce a reference sequence for the variety Chinese Spring.
    • All the chromosome arms now have a completed survey sequence analysis. This provides a pool of DNA contigs that can be used to anchor gene sequences and proteins to chromosome arms.
  • The array technologies to assay single nucleotide polymorphisms (SNPs) are now establishing genetic maps with 2000-3000 molecular markers.
    Map for chromosomes 1A, 1B, 1D, from a cross, Avalon x Cadenza.
    Allen AM, Barker GLA, Berry ST, Coghill JA, Gwilliam R, Kirby S, Robinson P, Brenchley RC, D’Amore R, McKenzie N, Waite D, Hall A, Bevan M, Hall N, Edwards KJ (2011) Transcript-specific, single-nucleotide polymorphism discovery and linkage analysis in hexaploid bread wheat (Triticum aestivum L.). Plant Biotechnology Journal 2011: 1-14.
  • Chromosome 7A
    The 9000 SNP array (“chip”) technology for assaying SNPs has been used to establish a 2000 molecular marker map for a set of 225 doubled haploid lines from a Westonia x Kauz cross.
    A large study in Australia is examining progeny from a complex cross (MAGIC, currently a 4-way cross using Baxter, Yitpi, Westonia, Chara, 1500 lines, with markers from a 9K SNP chip and markers from a 90K chip planned). This work is at CSIRO with Colin Cavanagh.
    For an 8-way cross using Baxter, Yitpi, Westonia, AC Barrie (Canada), Alsen (US), Pastor (CIMMYT), Xiaoyan 54 (China), and Volcani (Israel), 5000 lines are being characterized.
  • In a large population of 5,000 lines (as required for accurate mapping) it is not feasible to phenotype all progeny.
    The marker information can be used to define families of progeny for phenotyping.
    For the 1500 lines from the 4-way MAGIC cross, a population of 370 families has been defined for phenotyping (in duplicated/randomized designs). While we are still in the middle of this analysis (which includes milling yield), some QTL for % wet gluten at the LMW-glutenin locus of chromosome 1B are evident.
    It is interesting that in the high resolution maps the QTL may not be exactly superimposed on the LMW-glutenin locus.
  • GluStar system for “wet gluten” measurements on 4.5 g flour
    • MAGIC and assignment of a QTL for % wet gluten to 1B near the LMW glutenin locus but not coincident with it
    • The high density of markers allows a fine resolution of map location when 1,500 progeny are analyzed
    Tomoshozi S, Budapest University of Technology and Economics; http://www.labintern.hu
  • To determine protein fingerprints as a “phenotype” we have explored MALDI-TOF as a means for increasing the number of lines we can analyse.
    Low molecular weight glutenins: Li et al (2010). BMC Plant Biology 10:124
  • High molecular weight glutenins (70,000-90,000 Da): Li et al (2009). Cereal Sci. 50: 295-301; Gao L et al (2010). J Ag Food Chem 58: 2777-2786
  • HMW-GS: Mr deduced from the coding gene compared with Mr measured by MALDI-TOF. Li et al (2009) Cereal Sci. 50: 295-301
    HMW-GS    Mr (Da) deduced from coding gene    Mr (Da) by MALDI-TOF
    1Ax2*     86309                               86200
    1Bx6      Unknown                             86500
    1Bx7      82524                               82300
    1Bx7OE    83134                               82900
    1Bx7b*    Unknown                             82600
    1Bx13     Unknown                             83000
    1Bx14     84012                               83600
    1Bx17     78607                               77900, 78400
    1Bx20     Unknown                             82100
    1Dx2      87022                               87000
    1Dx3      Unknown                             85400
    1Dx5      88128                               87900
    1By8      75156                               74900
    1By8a*    Unknown                             74800
    1By8b*    Unknown                             75000
    1By9      73515                               73300
    1By15     75733                               74900
    1By16     Unknown                             76900
    1By18     Unknown                             75000
    1By20     Unknown                             74900
    1Dy10     67473                               67300
    1Dy12     68652                               68300
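The two mass columns in the table above can be compared directly. The following Python sketch, assuming the values as transcribed from the slide, computes the difference between the MALDI-TOF mass and the mass deduced from the coding gene for the subunits where both are given (1Bx17 is left out because two MALDI-TOF peaks are listed for it). The function and variable names are illustrative and do not come from the presentation.

```python
# Masses (Da) copied from the slide: (deduced from coding gene, by MALDI-TOF).
# Subunits with an "Unknown" deduced mass, and 1Bx17 (two peaks), are omitted.
HMW_GS_MASSES = {
    "1Ax2*": (86309, 86200),
    "1Bx7": (82524, 82300),
    "1Bx7OE": (83134, 82900),
    "1Bx14": (84012, 83600),
    "1Dx2": (87022, 87000),
    "1Dx5": (88128, 87900),
    "1By8": (75156, 74900),
    "1By9": (73515, 73300),
    "1By15": (75733, 74900),
    "1Dy10": (67473, 67300),
    "1Dy12": (68652, 68300),
}


def mass_deviation(deduced, observed):
    """Return (difference in Da, difference as % of the deduced mass)."""
    diff = observed - deduced
    return diff, 100.0 * diff / deduced


def largest_deviation(masses):
    """Name of the subunit whose observed mass differs most from the deduced mass."""
    return max(masses, key=lambda name: abs(mass_deviation(*masses[name])[0]))


for name, (deduced, observed) in HMW_GS_MASSES.items():
    diff, percent = mass_deviation(deduced, observed)
    print(f"{name:8s} {diff:+5d} Da ({percent:+.2f}%)")
```

A listing like this makes it easy to see which subunits have MALDI-TOF masses that sit close to the gene-deduced value and which deviate more.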
  • The MALDI-TOF based analyses of the LMW and HMW glutenins have provided a good basis for establishing a high throughput analysis for breeding programs. This analysis now runs as a fee-for-service (Saturn Biotech; AUS$6/sample).
    The glutenin subunit protein loci known to date, however, can only account for approximately 60% of the variation in measured grain quality attributes.
    More detailed genetic analysis is yielding new information.
  • Chromosome 1D
    Map based on DH lines from a Westonia x Kauz cross (mapped peaks: L29183, L33288, L33529).
    The classic designation of the LMW glutenin locus in Westonia on chromosome 1D is LMWG-D3c (in addition to A3c, B3h). The Kauz designation is not known.
    Peaks from: Westonia = L33288; Kauz = L29183, L33529
    Peaks found in LMWG-D3c (based on Li et al 2010): 33021, 33290, 33453
    Li et al (2010). BMC Plant Biology 10:124
  • Chromosome 7A
    Classical mapping of LMW-glutenin loci defined the chromosome 1A, 1B and 1D loci based on single dimension SDS PAGE technology (Gupta and Shepherd, 1994), and it was noted then that the protein family was complex.
    We now find that some of the peaks in the MALDI-TOF are mapping to other chromosomes, such as chromosome 7A.
    We used our wheat proteome database to see if we could identify the L32831 and L31965 proteins.
    Gupta and Shepherd (1994). Two-step one-dimensional SDS-PAGE analysis of LMW subunits of glutenin. I. Variation and genetic control of the subunits in hexaploid wheats. Theor. Appl. Genet. 80:65-74
  • Chromosome 7A
    In this analysis we are accessing a complex part of the LMW glutenin protein spectrum that was not available for analysis in the earlier SDS gel-based studies.
    Mapped peaks: L32831, L31965
  • Criteria for database search:
    (1) Qualitative: amino acid composition (occurrence of QQQ etc) consistent with being co-extracted with LMW-glutenins (gliadins removed beforehand)
    (2) Quantitative: molecular weight within 10 dalton
    Query: L31965
    Hit                                        Mr (Da)
    IWGSC_4DS_v1_2275417.fa.genscan.pep.1      31960
    IWGSC_2AL_v1_6356128.fa.genscan.pep.2      31960
    IWGSC_4BS_v1_4917914.fa.genscan.pep.1      31960
    IWGSC_1AL_v2_3915175.fa.genscan.pep.1      31960
    Komugi_AJ133603_1                          31960
    IWGSC_3B_v1_10586963.fa.genscan.pep.1      31961
    IWGSC_5DS_v1_2734070.fa.genscan.pep.1      31961
    IWGSC_2BS_v1_5247743.fa.genscan.pep.3      31961
    >Komugi_AJ133603_1 AJ133603 7209247 [Triticum aestivum] Triticum aestivum mRNA for alpha-gliadin storage protein, clone alpha-9
    MVRVTVPQLQPQNPSQQQPQEQ
    VPLVQQQQFLGQQQPFPPQQPYP
    QPQPFPSQQPYLQLQPFPQPQLP
    YSQPQPFRPQQPYPQPQPQYSQP
    QQPISQQQQQQQQQQQQQQQQ
    QQQQQQQILQQILQQQLIPCMDV
    VLQQHNIVHGRSQVLQQSTYQLL
    QELCCQHLWQIPEQSQCQAIHNV
    VHAIILHQQQKQQQQPSSQVSFQ
    QPLQQYPLGQGSFRPSQQNPQAQ
    GSVQPQQLPQFEEIRNLALQTLPA
    MCNVYIPPYCTIAPFGIFGTNYR
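The two search criteria on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' search tool: it assumes that the peak label L31965 corresponds to a mass of 31965 Da, it takes the hit masses as transcribed from the slide, and the function names and the out-of-range candidate are hypothetical.

```python
# Candidate hits (identifier, mass in Da) as listed on the slide for query L31965.
HITS = [
    ("IWGSC_4DS_v1_2275417.fa.genscan.pep.1", 31960),
    ("IWGSC_2AL_v1_6356128.fa.genscan.pep.2", 31960),
    ("IWGSC_4BS_v1_4917914.fa.genscan.pep.1", 31960),
    ("IWGSC_1AL_v2_3915175.fa.genscan.pep.1", 31960),
    ("Komugi_AJ133603_1", 31960),
    ("IWGSC_3B_v1_10586963.fa.genscan.pep.1", 31961),
    ("IWGSC_5DS_v1_2734070.fa.genscan.pep.1", 31961),
    ("IWGSC_2BS_v1_5247743.fa.genscan.pep.3", 31961),
]


def within_tolerance(query_mass, candidate_mass, tolerance=10.0):
    """Quantitative criterion: candidate mass within `tolerance` Da of the query."""
    return abs(candidate_mass - query_mass) <= tolerance


def has_glutamine_run(sequence, motif="QQQ"):
    """Qualitative criterion (simplified): the sequence contains a glutamine run."""
    return motif in sequence.upper()


def search(query_mass, candidates, tolerance=10.0):
    """Return identifiers of candidates that pass the mass criterion."""
    return [name for name, mass in candidates
            if within_tolerance(query_mass, mass, tolerance)]


# A hypothetical candidate, added only to show a rejection.
candidates = HITS + [("hypothetical_out_of_range", 32500)]
print(search(31965, candidates))

# Fragment of the Komugi_AJ133603_1 sequence shown on the slide.
print(has_glutamine_run("QQPISQQQQQQQQQQQQQQQQ"))
```

In practice the qualitative check would look at overall amino acid composition rather than a single motif; the substring test is only a stand-in for that step.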
  • Criteria for database search:
    (1) Qualitative: amino acid composition (occurrence of QQQ etc) consistent with being co-extracted with LMW-glutenins (gliadins removed beforehand)
    (2) Quantitative: molecular weight within 10 dalton
    Query: L32831
    Hit                                        Mr (Da)
    IWGSC_4BL_v1_6996674.fa.genscan.pep.4      31980
    Solomon_Q8H0J4_WHEAT                       31934
    Solomon_B2ZRD2_WHEAT                       32829
    >Solomon_B2ZRD2_WHEAT B2ZRD2 SubName: Full=Alpha-gliadin; [Triticum aestivum (Wheat).]
    MKTFLILALLAIVATTATTAGRVPVPQL
    QPQNPSQQQPQEQVPLVQQQQFLGQ
    QQPFPPQQPYPQPQPFPSQQPYLQLQP
    FPQPQLPYSQPQPFRPQQPYPQPQPQY
    SQPQQPISQQQQQQQQQQQQQQQEQ
    QILQQILQQQLIPCMDVVLQQHNIAH
    GRSQVLQQSTYQLLQELCCQHLWQIP
    EQSQCQAIHNVVHAIILHQQQKQQQQ
    PSSQFSFQQPLQQYPLGQGSSRPSQQN
    PQAQGSVQPQQLPQFEEIRNLALQTLP
    AMCNVYIPPYCTIAPFGIFGTN
  • Chromosome 7A
    This analysis suggests that there are probably more genetic loci for major seed storage proteins than we have found to date. Genome sequencing and proteome analyses, combined with genetic mapping, can define these new loci and provide molecular markers for breeding and selection.
    It turns out that a 1980 report did find LMWG/gliadins on 4B and 7A.
    Mapped peaks: L32831, L31965
    Salcedo G, Prada J, Sanchez-Monge R, Aragoncillo C (1980). Aneuploid analysis of low molecular weight gliadins from wheat. Theor Appl Genet 56: 65-69
  • Chromosome 7A
    The “hits” on chromosome 7A will be resolved as we have now started to sequence this chromosome, as a national project in Australia. This is part of the International Wheat Genome Sequencing Consortium (IWGSC), in which different countries around the world are doing a chromosome each.
    Mapped peaks: L32831, L31965
  • The Wheat Proteome database:
    Motivation: wheat genome, transcriptome and proteome studies are now advanced and need a reference proteome database for
    • annotating the genes in wheat
    • assigning peptides, obtained from high level proteomic analyses, to wheat proteins
    Content of proteins/peptides:
    • wheat/Triticum entries from SwissProt, UniProt, TrEMBL (2,690)
    • translations from the KOMUGI full-length cDNA collection (13,717)
    • peptides from INRA (France), USDA (USA), CNU (China) labs (still sorting out a final non-redundant set)
    • IWGSC genome-wide-sequence (GWS) gene model translations (144,920)
  • The Wheat Proteome database:
    (1) Translations of conserved genes. The IWGSC-GWS database for each chromosome arm typically identifies 4000-9000 genic sequences per chromosome. These include gene fragments and pseudogenes. Following their identification, genes conserved between wheat, Brachypodium, rice, sorghum and barley (Klaus Mayer “chromosome zipper”) can be clustered into syntenic groups.
    (2) Non-redundant proteins known to originate from wheat. 30-40% of the gene complement in wheat and barley does not reside in the conserved syntenic gene order space.
    All genes and protein/peptide sequences need to be anchored to the IWGSC-GWS chromosome arm DNA sequences. So far only 205 KOMUGI translations and 6 from the SwissProt/UniProt/TrEMBL dataset have been anchored to the IWGSC-GWS translations, so there is quite a bit of curation to carry out.
  • To complete this presentation it is important to consider translating research findings to industry.
    (1) Further stream-lining of the MALDI-TOF scoring of wheat proteins
    (2) Assigning a toxicity score to specific proteins in considering celiac and wheat allergy reactions to wheat flour
    The aim is to be able to enter specific features of the wheat grain as a number into a Decision Matrix.
    Decision Matrix: weights are assigned to features (genome fingerprint, gene marker, protein marker, other traits). For each breeding line (matrix rows) the feature score (matrix columns) is multiplied by the feature weight. These are then added to provide a selection index (SI). This SI is used to rank breeding lines or suitability for an end-product in industry.
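The selection index described on this slide is a weighted sum, which can be sketched as follows. The feature names follow the slide; the weights, scores and line names are hypothetical values chosen only to show the arithmetic, and the function names are not from the presentation.

```python
# Hypothetical feature weights (the slide does not give numeric values).
WEIGHTS = {
    "genome_fingerprint": 1.0,
    "gene_marker": 2.0,
    "protein_marker": 3.0,
    "other_traits": 1.0,
}

# Hypothetical decision matrix: one row per breeding line, one score per feature.
MATRIX = {
    "line_1": {"genome_fingerprint": 1.0, "gene_marker": 0.0,
               "protein_marker": 1.0, "other_traits": 0.5},
    "line_2": {"genome_fingerprint": 0.0, "gene_marker": 1.0,
               "protein_marker": 0.0, "other_traits": 1.0},
    "line_3": {"genome_fingerprint": 1.0, "gene_marker": 1.0,
               "protein_marker": 1.0, "other_traits": 0.0},
}


def selection_index(scores, weights):
    """Multiply each feature score by its weight and add the products."""
    return sum(scores[feature] * weights[feature] for feature in weights)


def rank_lines(matrix, weights):
    """Breeding lines ordered from highest to lowest selection index."""
    return sorted(matrix,
                  key=lambda line: selection_index(matrix[line], weights),
                  reverse=True)


for line in rank_lines(MATRIX, WEIGHTS):
    print(line, selection_index(MATRIX[line], WEIGHTS))
```

Changing the weights to suit a different end-product re-ranks the same lines without re-scoring them, which is the practical appeal of keeping scores and weights separate.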
  • (1) Further stream-lining of the MALDI-TOF scoring of wheat proteins: we are following the MALDIquant process described by Sebastian Gibb (IMISE, University of Leipzig)
    1: raw
    2: variance stabilization
    3: smoothing
    4: base line correction
    5: peak detection
    6: peak plot
    Dean Diepeveen
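To make the processing steps listed on this slide concrete, here is a small pure-Python sketch of steps 1 to 5. It does not use MALDIquant (which is an R package) and does not reproduce its algorithms: the square-root transform, moving average, rolling-minimum baseline and local-maximum peak picking are simple stand-ins, and the synthetic spectrum, window sizes and threshold are all assumptions made for illustration. Step 6 (plotting) is omitted.

```python
import math


def stabilize_variance(intensities):
    """Step 2: square-root transform; negative readings are clipped to zero."""
    return [math.sqrt(max(value, 0.0)) for value in intensities]


def smooth(values, half_window=2):
    """Step 3: moving average over 2 * half_window + 1 points (shorter at the ends)."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def remove_baseline(values, half_window=50):
    """Step 4: subtract a rolling-minimum estimate of the baseline."""
    corrected = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        corrected.append(values[i] - min(window))
    return corrected


def detect_peaks(masses, values, half_window=5, min_height=1.0):
    """Step 5: keep local maxima that reach at least min_height."""
    peaks = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        if values[i] >= min_height and values[i] == max(window):
            peaks.append((masses[i], values[i]))
    return peaks


def find_peaks(masses, intensities):
    """Run steps 2 to 5 on a raw spectrum (step 1) and return the peak list."""
    values = stabilize_variance(intensities)
    values = smooth(values)
    values = remove_baseline(values)
    return detect_peaks(masses, values)


def synthetic_spectrum(peak_masses, start=31000, stop=34000):
    """Noise-free test spectrum: Gaussian peaks on a slowly rising baseline."""
    masses = list(range(start, stop))
    intensities = []
    for mass in masses:
        signal = 20.0 + 0.001 * (mass - start)
        for centre in peak_masses:
            signal += 100.0 * math.exp(-((mass - centre) ** 2) / 200.0)
        intensities.append(signal)
    return masses, intensities


masses, raw = synthetic_spectrum([31965, 32831])
for mass, height in find_peaks(masses, raw):
    print(mass, round(height, 2))
```

Real spectra are noisy, so the window sizes and the height threshold would need tuning, and a signal-to-noise criterion would normally replace the fixed threshold used here.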
  • (2) Assigning a toxicity score to specific proteins in considering celiac disease (CD) and wheat allergy (WA) reactions to wheat flour
    Proof of concept by Angela Juhasz and Frank Bekes, carried out on the data set published by DuPont et al (2011).
    Every protein in the wheat grain defined by DuPont et al (2011) was assigned a toxicity score, which is the amount of protein in the grain multiplied by the number of epitopes present that are known to relate to CD and/or WA.
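The toxicity score defined on this slide is a simple product, shown here as a short Python sketch. The protein names, amounts and epitope counts are hypothetical placeholders, not values from DuPont et al (2011), and the units of the amount are left open because the slide does not state them.

```python
def toxicity_score(amount, epitope_count):
    """Amount of protein in the grain multiplied by the number of known CD/WA epitopes."""
    return amount * epitope_count


# Hypothetical proteins: amount in arbitrary units, count of known epitopes.
PROTEINS = {
    "protein_A": {"amount": 2.5, "epitopes": 4},
    "protein_B": {"amount": 10.0, "epitopes": 0},
    "protein_C": {"amount": 0.5, "epitopes": 6},
}

scores = {
    name: toxicity_score(info["amount"], info["epitopes"])
    for name, info in PROTEINS.items()
}

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(name, score)
```

Note that an abundant protein with no known epitopes scores zero, while a minor protein carrying several epitopes can still rank high; a score like this could then be entered as one feature column in the Decision Matrix.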
  • The proteins of the wheat grain form a significant phenotype in breeding, industry processing and marketing, and will become more important in defining the product.