Closing the gap – linking
collection data to applied
research
Klaus Riede
Alexander Koenig Zoological Research Institute
and Museum of Zoology
&
Center for Development Research
(ZEF), Bonn, Germany
“Integrated species
information systems will
allow data mining that
cannot be imagined today”
(EDWARDS et al. 2000)
Anthemideae
Data mining: Insect - plant relations
Step 1: Secondary compounds (INFOBOT)
Data mining:
Insect - plant relations
• Step 1: Identification of Anthemideae - Compositae
BIOLOG - INFOBOT/Bohlmann Files
• Step 2: Distribution of host plants: Anthemideae BIOLOG - INFOBOT
• Step 3: Distribution of phytophagous insects (e.g. grasshoppers,
butterflies)
BIOLOG - SYSTAX - EDIS-DORSA/GART/INGE
• Step 4: Comparisons of distribution by
– Visualising maps (GIS)
– Geostatistics
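Step 4 can be illustrated with a minimal sketch. It assumes that both the host-plant and the insect records are already geo-referenced as (latitude, longitude) pairs, and it uses a simple grid-cell Jaccard index as a stand-in for the comparison; the slides do not name a specific statistic. All coordinates and function names are hypothetical.

```python
from math import floor

def grid_cell(lat, lon, cell_deg=1.0):
    """Return the index of the grid cell (cell_deg degrees wide) containing a point."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))

def occupied_cells(records, cell_deg=1.0):
    """Collapse point records into the set of grid cells they occupy."""
    return {grid_cell(lat, lon, cell_deg) for lat, lon in records}

def jaccard_overlap(records_a, records_b, cell_deg=1.0):
    """Shared cells divided by all occupied cells (0 = disjoint, 1 = identical)."""
    a = occupied_cells(records_a, cell_deg)
    b = occupied_cells(records_b, cell_deg)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Hypothetical records, not taken from any BIOLOG database.
host_plant_records = [(50.7, 7.1), (50.2, 7.9), (48.1, 11.6)]
insect_records = [(50.9, 7.3), (48.4, 11.2), (52.5, 13.4)]

overlap = jaccard_overlap(host_plant_records, insect_records)
print(f"Grid-cell overlap: {overlap:.2f}")
```

A coarse grid smooths over uneven collecting effort but hides fine-scale differences, so the cell size should be chosen to match the precision of the locality data.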
Applications of data mining
• agriculture
• conservation
• ecology
...but how far have we got?
• taxo-reference (determination)
• geo-reference (linking to GIS)
• Integration (with “applied” databases)
Taxo-reference:
The taxonomic impediment
How many species are
there?
• 30 million (Erwin 1983)?
• ...but definitely 5 million (Gaston 1991)
• 3 times more than taxonomically described!
Where are they?
TYPES OF TOMORROW
Cerace diehlii
Heteroc. Sumatr. 12(3): 155-161
Chironomus sp. WOC
Eneoptera sp.
Gene sequences for description
Interdisciplinary approach to Phylogeny of Geometrid Moths:
Combined character sets from two BIOLOG projects.
[Figure: phylogenetic trees for the same 19 geometrid taxa]
Taxa: Peribatodes rhomboidaria, Idaea filicata, Idaea distinctaria, Idaea seriata, Idaea degeneraria, Idaea aversata, Scopula rubiginata, Scopula imitaria, Scopula vigilata, Scopula marginepunctata, Glossotrophia alba, Xanthorhoe vidanoi, Xanthorhoe ferrugata, Epirrhoe alternata, Cyclophora puppillaria, Chloroclystis v-ata, Timandra griseata, Timandra comae (Finland), Timandra comae (Bavaria)
• Morphological tree (from: Abraham et al. 2001, modified and refined by INGE project)
• DNA-based tree (from: Miller, Hausmann, Trusch 2001)
• INGE CONSENSUS: morphological, distributional, and ecological data combined with molecular data
• Identification of larvae/immatures
• Identification by non-taxonomists
• Examples from Lepidoptera:
– Kl. Frostspanner, Winter Moth (Operophtera brumata)
– Gr. Frostspanner, Mottled Umber (Erannis defoliaria)
– Roßkastanienspanner, March Moth (Alsophila aescularia)
– Kiefernspanner, Bordered White (Bupalus piniaria)
DNA sequencing as a Reliable Assessment Tool
Applications in:
Agricultural Entomology (EDIS/INGE; EDIS/DNA-TAX)
Forest Entomology (EDIS/OBIF)
Where are the extinct species?
• 17,500 species becoming extinct per year (estimate: Wilson
1988)
• Could you name one?
• IUCN - International Red List 2000 (Hilton-Taylor 2000):
• Insecta:
– 73 EXTINCT
– 40 DATA DEFICIENT
Improving our estimate by:
- investigating destroyed habitats
- extrapolating from botanical data (endangered/extinct plants)
Insecta are not covered by IUCN ....
...but there are well-monitored flagship species
Species fact sheet from:
EDIS/GART
Specimen data and Conservation
BIOLOG Databases
– taxonomic databases
– specimen data (history)
– Rapid assessment
(EDIS/ABIS, DORSA)
– Reliable assessment
(EDIS/DNA-TAX, INGE)
– keys (EDIS/OBIF, INGE)
CONSERVATION Databases
– monitoring threat status
(IUCN - www.redlist.org;
BirdLife International - IBAs)
– population data/ time series
(UNEP-WCMC - marine turtles;
Wetlands International - waterbirds;
ringing databases - birds, bats)
– Environmental data
(WWF; UNEP-WCMC; CI;
CIESIN; GRID-Arendal)
Combining Museum and Monitoring Data by
Mapping
Nesting beaches:
Marine
turtles/Indopacific
World Conservation
Monitoring Centre
(WCMC), 1999
Museum data:
adapted from Iverson
1992
Compilation:
Global Register of
Migratory Species
(GROMS); 2001
Species distribution maps require various
sources
• Museum specimen data (points)
• Monitoring data (selected sites)
• Generalized data (Expected Area, models)
• Raster data
The OpenGIS concept
• Web-based interchange of GIS
data and protocols
• Open-source software
• OpenGIS mapserver and
protocol are developed for
SYSTAX by EXSE, Dept of
Geoinformatics, Bonn
University
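As a rough illustration of web-based interchange of GIS data, the sketch below builds a map request URL. It assumes the map server speaks the OGC Web Map Service (WMS) 1.1.1 interface, which the slides do not state explicitly; the server address and the layer name are hypothetical placeholders, not the actual SYSTAX endpoint.

```python
from urllib.parse import urlencode

def build_getmap_url(base_url, layer, bbox, width=800, height=400):
    """Build a WMS 1.1.1 GetMap URL for one layer.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty string requests the server's default style
        "SRS": "EPSG:4326",
        "BBOX": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"

# Hypothetical server and layer name, used for illustration only.
url = build_getmap_url(
    "https://example.org/wms",
    "specimen_points",
    (90.0, -15.0, 150.0, 25.0),
)
print(url)
```

Because the request is a plain URL, any client (browser, desktop GIS, script) can overlay specimen layers from one server on environmental layers from another without exchanging files.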
Geo-referencing:
the bottleneck for specimen databases
One collector:
3 sites
3,000 specimens
The solution:
Geo-reference the
collector
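The idea of geo-referencing the collector rather than each specimen can be sketched as a lookup: coordinates are resolved once per (collector, locality) pair and then propagated to every specimen label that matches. The field names, collector name, site names, and coordinates below are hypothetical; only the proportions (one collector, 3 sites, 3,000 specimens) follow the slide.

```python
# Hypothetical gazetteer: one entry per (collector, locality), resolved by hand once.
SITE_COORDINATES = {
    ("A. Collector", "Site A"): (0.5, 100.5),
    ("A. Collector", "Site B"): (1.2, 101.4),
    ("A. Collector", "Site C"): (2.8, 99.1),
}

def georeference(specimens, gazetteer):
    """Attach coordinates to specimens by looking up their collector and locality.

    Returns (located, unresolved); unresolved records need manual attention.
    """
    located, unresolved = [], []
    for specimen in specimens:
        key = (specimen["collector"], specimen["locality"])
        coords = gazetteer.get(key)
        if coords is None:
            unresolved.append(specimen)
        else:
            located.append({**specimen, "lat": coords[0], "lon": coords[1]})
    return located, unresolved

# 3,000 synthetic specimen records spread over the three sites.
sites = ["Site A", "Site B", "Site C"]
specimens = [
    {"id": i, "collector": "A. Collector", "locality": sites[i % 3]}
    for i in range(3000)
]

located, unresolved = georeference(specimens, SITE_COORDINATES)
print(len(located), "specimens located from", len(SITE_COORDINATES), "resolved sites")
```

The manual effort scales with the number of distinct collecting sites, not with the number of specimens, which is why the collector is the efficient unit to geo-reference.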
