Trait data mining at European pre-breeding workshop at Alnarp (25 Nov 2009)


Trait mining with eco-geographic data for improved utilization of plant genetic resources. Presentation for the cereal pre-breeding workshop at Alnarp. A brief overview of the new trait mining method: Focused Identification of Germplasm Strategy (FIGS). And many thanks to Michael Mackay and Ken Street for providing some of the slides!

Endresen, D.T.F. (2010). Predictive association between trait data and ecogeographic data for Nordic barley landraces. Crop Sci. 50(6):2418-2430. doi: 10.2135/cropsci2010.03.0174

Published in: Technology
  • I made two small changes after uploading the first version. First, thanks to Dirk for spotting that I had labeled the state of South Australia as Queensland (slide 19). I am amazed that you spotted this during the few hours the slides were online here before the presentation, and in time for me to correct the slide I presented! Thanks also to Isaak for spotting that I had marked the wrong Priekuli in Latvia (slide 27) on the Google Maps illustration. The red marker for Priekuli is now corrected in the slides here.


  1. 1. (slide 2)
     • Utilization of genetic diversity
     • Core collection subset
     • Trait mining selection (FIGS)
     • Computer modeling
     • Some examples (FIGS)
  2. 2. (slide 3) Image labels: wild tomato; tomato; teosinte; corn, maize
  3. 3. (slide 4) Figure stages: Crop wild relatives; Traditional landraces; Modern cultivars
     Genetic bottlenecks during crop domestication and during modern plant breeding. The circles represent allelic variation. The funnels represent the allelic variation of genes found in the crop wild relatives but gradually lost during domestication, traditional cultivation and modern plant breeding.
  4. 4. (slide 5)
  5. 5. (slide 6)
     • Scientists and plant breeders want a few hundred germplasm accessions to evaluate for a particular trait.
     • How does the scientist select a small subset likely to have the useful trait?
     • Example: More than 560 000 wheat accessions in genebanks worldwide.
     Slide adapted from a slide by Ken Street, ICARDA (FIGS team)
  6. 6. (slide 7)
     • The scientist or the breeder needs a smaller subset to cope with the field screening experiments.
     • A common approach is to create a so-called core collection.
     Sir Otto H. Frankel (1900-1998) proposed a limited set established from an existing collection with minimum similarity between its entries. The core collection is of limited size and chosen to represent the genetic diversity of a large collection (1984).
  7. 7. (slide 8)
     • Given that the trait property you are looking for is relatively rare:
     • Perhaps as rare as a unique allele for one single landrace cultivar...
     • Getting what you want is largely a question of LUCK!
     Slide adapted from a slide by Ken Street, ICARDA (FIGS team)
  8. 8. (slide 9)
  9. 9. (slide 10) Objective of this study:
     – Explore climate data as a prediction model for "computer pre-screening" of crop traits BEFORE full-scale field trials.
     – Identification of landraces with a higher probability of holding an interesting trait property.
  10. 10. (slide 11)
      • Wild relatives are shaped by the environment
      • Primitive cultivated crops are shaped by local climate and humans
      • Traditional cultivated crops (landraces) are shaped by climate and humans
      • Modern cultivated crops are mostly shaped by humans (plant breeders)
      • Perhaps future crops are shaped in the molecular laboratory…?
  11. 11. (slide 12)
      • Primitive crops and traditional landraces are an important source of novel traits for improvement of modern crops.
      • Landraces are often not well described for the economically valuable traits.
      • Identification of novel crop traits will often be the result of a larger field trial screening project (thousands of individual plants).
      • Large-scale field trials are very costly in both land area and human working hours.
  12. 12. (slide 13)
      Assumption: the climate at the original source location, where the landrace was developed during long-term traditional cultivation, is correlated with the trait score.
      Aim: to build a computer model explaining the crop trait score (dependent variables) from the climate data (independent variables).
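
The aim stated on the slide above, explaining a trait score from climate data, can be illustrated in its simplest form with a one-variable least-squares fit. This is only a sketch: the presentation refers to multivariate methods, whereas the code below swaps in ordinary least squares with a single predictor, and the function names and all numbers are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor: y ~ intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the climate variable has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Predict a trait score for a new climate value."""
    intercept, slope = model
    return intercept + slope * x


# Invented values: mean temperature at the collecting site (independent
# variable) and the observed trait score (dependent variable).
site_temperature = [4.0, 6.0, 8.0, 10.0]
trait_score = [3.0, 4.0, 5.0, 6.0]

model = fit_line(site_temperature, trait_score)
print(model, predict(model, 12.0))
```

A real analysis would use many climate variables at once and would check the fit on data not used for calibration.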
  13. 13. (slide 14)
      1) Landrace samples (genebank seed accessions)
      2) Trait observations (experimental design) - High cost data
      3) Climate data (for the landrace location of origin) - Low cost data
      • The accession identifier (accession number) provides the bridge to the crop trait observations.
      • The longitude and latitude coordinates for the original collecting site of the accessions (landraces) provide the bridge to the environmental data.
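
The two "bridges" on the slide above amount to two key lookups: the accession number joins an accession to its trait observation, and the collecting-site coordinates join it to the climate data. A minimal sketch follows, assuming simple in-memory records; every identifier, field name and value is hypothetical.

```python
# Hypothetical genebank accessions with collecting-site coordinates.
accessions = [
    {"accession": "ACC001", "lat": 57.3, "lon": 25.4},
    {"accession": "ACC002", "lat": 60.8, "lon": 10.9},
    {"accession": "ACC003", "lat": 55.9, "lon": 12.8},
]

# High-cost data: trait observations keyed by accession number.
trait_scores = {"ACC001": 4.0, "ACC002": 7.0}

# Low-cost data: climate values keyed by (latitude, longitude).
climate_by_site = {
    (57.3, 25.4): {"annual_precip_mm": 690, "mean_temp_c": 5.6},
    (60.8, 10.9): {"annual_precip_mm": 600, "mean_temp_c": 3.9},
    (55.9, 12.8): {"annual_precip_mm": 560, "mean_temp_c": 8.2},
}


def link_records(accessions, trait_scores, climate_by_site):
    """Join each accession to its trait score and its site climate."""
    rows = []
    for acc in accessions:
        trait = trait_scores.get(acc["accession"])
        climate = climate_by_site.get((acc["lat"], acc["lon"]))
        if trait is None or climate is None:
            continue  # skip accessions missing either bridge
        rows.append({"accession": acc["accession"], "trait": trait, **climate})
    return rows


linked = link_records(accessions, trait_scores, climate_by_site)
for row in linked:
    print(row)
```

Accessions with climate data but no trait observation, such as `ACC003` here, are the ones a prediction model would later be applied to.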
  14. 14. (slide 15) Image labels: Alnarp, Sweden; Lima, Peru; Svalbard; Benin
  15. 15. (slide 16) Photo captions: Faba bean, Finland; Field trials, Gatersleben, Germany; Potato, Priekuli, Latvia; Forage crops, Dotnuva, Lithuania; Radish (S. Jeppson); Linnés äpple
      Diseases: Powdery mildew (Blumeria graminis); Leaf spots (Ascochyta sp.); Yellow rust (Puccinia striiformis); Black stem rust (Puccinia graminis)
      http://barley.ipk-
  16. 16. (slide 17)
      • The climate data is extracted from the WorldClim dataset. http://
      • Data from weather stations worldwide are combined into a continuous surface layer.
      • Climate data for each landrace is extracted from this surface layer.
      Precipitation: 20 590 stations
      Temperature: 7 280 stations
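
Extracting a climate value for a landrace from a continuous surface layer comes down to finding the grid cell that contains the collecting site. The sketch below uses a toy 2 x 2 grid held in a Python list; it does not read the WorldClim file format, and the function name, grid layout and values are assumptions made for illustration.

```python
import math


def grid_value(grid, lat, lon, north_edge, west_edge, cell_size_deg):
    """Return the value of the grid cell containing (lat, lon).

    Row 0 is the northernmost row and column 0 the westernmost column.
    Returns None when the point falls outside the grid.
    """
    row = math.floor((north_edge - lat) / cell_size_deg)
    col = math.floor((lon - west_edge) / cell_size_deg)
    if not (0 <= row < len(grid) and 0 <= col < len(grid[0])):
        return None
    return grid[row][col]


# Toy precipitation layer (mm/year) covering 58-60 N, 10-12 E in 1-degree cells.
toy_precip = [
    [610, 640],
    [580, 700],
]

print(grid_value(toy_precip, 59.5, 10.5, north_edge=60.0, west_edge=10.0, cell_size_deg=1.0))
print(grid_value(toy_precip, 58.2, 11.7, north_edge=60.0, west_edge=10.0, cell_size_deg=1.0))
```

In practice a raster library would handle projections and file reading; the lookup logic stays the same.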
  17. 17. (slide 18) FIGS selection is a new method to predict crop traits of primitive cultivated material from climate variables by using multivariate statistical methods.
  18. 18. (slide 19) What is http://
      Origin of concept (1980s): Wheat and barley landraces from marine soils in the Mediterranean region provided genetic variation for boron toxicity tolerance.
      Map labels: Mediterranean region; South Australia
      Slide made by Michael Mackay, 1995
  19. 19. (slide 20) FIGS
      The FIGS technology takes much of the guesswork out of choosing which accessions are most likely to contain the specific characteristics being sought by plant breeders to improve plant productivity across numerous challenging environments.
      http://
  20. 20. (slide 21) Slide made by Michael Mackay, 1995
  21. 21. (slide 22)
  22. 22. (slide 23)
      – For the initial calibration or training step.
      – Further calibration, tuning step
      – Often cross-validation on the training set is used to reduce the consumption of raw data.
      – For the model validation or goodness of fit testing.
      – New external data, not used in the model calibration.
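
The slide above describes three uses of data: initial training, further tuning, and validation on data not used in calibration. One way to sketch this is a random three-way split. The set names, the 50/25/25 proportions and the function name are assumptions, not taken from the slides.

```python
import random


def split_dataset(records, train_frac=0.5, tune_frac=0.25, seed=0):
    """Shuffle records and split them into training, tuning and test sets.

    The test set receives whatever remains after the first two sets.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for a repeatable split
    n_train = int(len(shuffled) * train_frac)
    n_tune = int(len(shuffled) * tune_frac)
    train = shuffled[:n_train]
    tune = shuffled[n_train:n_train + n_tune]
    test = shuffled[n_train + n_tune:]
    return train, tune, test


# 100 hypothetical accession indices.
train, tune, test = split_dataset(range(100))
print(len(train), len(tune), len(test))
```

When data are scarce, the tuning set can be replaced by cross-validation on the training set, as the slide notes, so that only the test set has to be held back.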
  23. 23. (slide 24)
      – No model can ever be absolutely correct
      – A simulation model can only be an approximation
      – A model is always created for a specific purpose
      – The simulation model is applied to make predictions based on new, fresh data
      – Take care to avoid extrapolation problems
  24. 24. (slide 25)
  25. 25. (slide 26)
      • No sources of Sunn pest resistance previously found in hexaploid wheat.
      • 2 000 accessions screened at ICARDA without result (during the last 7 years).
      • A FIGS set of 534 accessions was developed and screened (2007, 2008).
      • 10 resistant accessions were found!
      • The FIGS selection started from 16 000 landraces from VIR, ICARDA and AWCC
      • Exclude origins CHN, PAK, IND, where Sunn pest was only recently reported (6 328 acc).
      • Only one accession per collecting site (2 830 acc).
      • Excluding dry environments below 280 mm/year
      • Excluding sites of low winter temperature below 10 degrees Celsius (1 502 acc)
      http://-009-9427-1
      Slide adapted from Ken Street, ICARDA (FIGS team)
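
The Sunn pest selection above is a chain of exclusion filters, which can be sketched as a single pass over the accession records. The thresholds and excluded origins follow the slide; the record layout, field names and sample data are hypothetical, and this is not the FIGS team's actual implementation.

```python
EXCLUDED_ORIGINS = {"CHN", "PAK", "IND"}  # Sunn pest only recently reported
MIN_ANNUAL_PRECIP_MM = 280                # exclude drier sites
MIN_WINTER_TEMP_C = 10                    # exclude colder sites, as on the slide


def figs_filter(accessions):
    """Apply the exclusion steps in order and return the remaining accessions."""
    seen_sites = set()
    selected = []
    for acc in accessions:
        if acc["origin"] in EXCLUDED_ORIGINS:
            continue
        site = (acc["lat"], acc["lon"])
        if site in seen_sites:
            continue  # keep only one accession per collecting site
        seen_sites.add(site)
        if acc["annual_precip_mm"] < MIN_ANNUAL_PRECIP_MM:
            continue
        if acc["winter_temp_c"] < MIN_WINTER_TEMP_C:
            continue
        selected.append(acc)
    return selected


# Invented sample records.
sample = [
    {"id": "A1", "origin": "SYR", "lat": 36.0, "lon": 37.0, "annual_precip_mm": 350, "winter_temp_c": 11},
    {"id": "A2", "origin": "SYR", "lat": 36.0, "lon": 37.0, "annual_precip_mm": 350, "winter_temp_c": 11},
    {"id": "A3", "origin": "CHN", "lat": 35.0, "lon": 105.0, "annual_precip_mm": 500, "winter_temp_c": 12},
    {"id": "A4", "origin": "TUR", "lat": 39.0, "lon": 33.0, "annual_precip_mm": 250, "winter_temp_c": 12},
    {"id": "A5", "origin": "IRN", "lat": 35.0, "lon": 51.0, "annual_precip_mm": 400, "winter_temp_c": 5},
    {"id": "A6", "origin": "UZB", "lat": 41.0, "lon": 69.0, "annual_precip_mm": 300, "winter_temp_c": 10},
]

selected = figs_filter(sample)
print([acc["id"] for acc in selected])
```

Because the climate values belong to the site rather than the accession, applying the one-per-site rule before or after the climate filters gives the same result.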
  26. 26. (slide 27) Map markers: Priekuli (L); Bjorke (N); Landskrona (S)
  27. 27. (slide 28) Figure panels: Heading; Ripening; Length; H-Index; Vol wgt; TGW. Sites: Priekuli (L); Bjorke (N); Landskrona (S)
  28. 28. (slide 29)
      • Barley (Hordeum vulgare ssp. vulgare) collected from different countries worldwide, screened for susceptibility to net blotch infection (1676 greenhouse + 2975 field observations).
      • Net blotch is a common disease of barley caused by the fungus Pyrenophora teres.
      • Screened at four USDA research stations: North Dakota (Langdon, Fargo), Minnesota (Stephen), Georgia (Athens).
      • Scores 1-3 are basically resistant (group 1)
      • Scores 4-6 are intermediate (group 2)
      • Scores 7-9 are susceptible (group 3)
      • Discriminant analysis (DA):
      • Correctly classified groups: 45.9% in the training set and 44.4% in the test set.
      • Work in progress! (SIMCA, D-PLS)
      Contributors: Michael Mackay (FIGS coordinator); Ken Street (FIGS project leader); Harold Bockelman (net blotch data); Eddy De Pauw (climate data); Dag Endresen (data analysis)
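
The score grouping and the "correctly classified" percentage on the slide above can be sketched as two small functions. The function names and example values are invented, and the predicted groups are made up rather than produced by a discriminant analysis.

```python
def score_to_group(score):
    """Map a 1-9 net blotch score to group 1 (resistant), 2 or 3 (susceptible)."""
    if not 1 <= score <= 9:
        raise ValueError("score must be between 1 and 9")
    if score <= 3:
        return 1
    if score <= 6:
        return 2
    return 3


def classification_accuracy(observed, predicted):
    """Fraction of accessions whose predicted group equals the observed group."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


# Invented scores and invented model predictions.
observed_groups = [score_to_group(s) for s in [2, 5, 8, 9]]
predicted_groups = [1, 2, 2, 3]
print(classification_accuracy(observed_groups, predicted_groups))
```

With three groups, a classifier that guesses uniformly at random would be correct about a third of the time, which is a useful baseline when reading the percentages reported on the slide.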
  29. 29. (slide 30)