Recreating biomes one label at a time

This document discusses efforts to digitize records of Hemiptera insect specimens and their host plants. Over 1 million insect specimens from 124 families have been digitized so far, including records of what plants they were collected on. The author analyzes the data to determine the probability that a hemipteran insect species was collected on a federally endangered or threatened plant in the US. The analysis finds that 19 species have over a 10% probability of being collected on an endangered host plant, and 9 species are only known from endangered hosts. Future work is suggested to expand this type of analysis to more insect groups and host plant data.

Recreating biomes one label at a time
Katja C. Seltmann
American Museum of Natural History
enicospilus@gmail.com

Human-mediated Disturbance
We are experiencing a great loss in biodiversity that is difficult to
calculate. Many efforts exist to sample a snapshot of
present biodiversity as the world is rapidly changing
from increased human activity.

Historical Ecology
What we grow up with today ends up as the baseline
for our viewpoint of “nature” in the future.

Hemiptera have declining plant hosts
Myzocallis (Myzocallis) castaneae
Reuteria querci
Telamona reclivata
Platycotis vittata
Atymna castaneae
Atymna querci
Ophiderma flava
Cyrtolobus maculifrontis
Archasia auriculata
Myzocallis (Neomyzocallis) punctata

Calculating host specificity

Role of natural history collections

Digitization of Hemiptera
Present Dataset:
Over 1,011,627 Total Specimens
Date Range: 1811 – Present
124 total insect families
Unique collecting events/ insect family

IUCN red list status
0.52% of insect species evaluated

Heteroptera have declining plant hosts
IUCN Red List:
Worldwide total plants: 345
81 plant species
322 Hemiptera species
USDA Plants List:
(Federally listed Endangered or Threatened)
North American total plants: 752
31 plant species
127 Hemiptera species

Heteroptera have declining plant hosts
Figure: insect-plant network diagram, with nodes labeled Insect and Plant and callouts 1 and 2.

Conclusions
Figure: p(x|y) (axis 0.0 to 1.0) against number of insect species (axis 0 to 10); series: red-listed, not red-listed.
In this dataset (31 hemipteran species collected from USDA red-listed plants and all of their other known hosts), there is a higher probability that the insect was collected on a red-listed host than on a non-red-listed host.
19 species have a > 10% probability that they were collected on a federally Endangered or Threatened plant.
9 species are only known to be collected on a red-listed species.

Future directions
• Continue to explore new methods for examining this data (Data Science: Ontology & Machine Learning).
– Explore data bias of collectors by adding Collector into the equation of p(x|y).
– Include a third trophic level (parasitoids) in the data analysis.
– Expand to the world plant host list and other insect records outside of Hemiptera.
– Include known phylogenies of insects and plants.

Acknowledgements
• TTD-TCN project PIs, digitizers and managers
• Randall T. Schuh
• National Science Foundation
• iDigBio and www.datacarpentry.org
• Museum collections and curators worldwide

Editor's Notes

At the present time we are experiencing a great loss in biodiversity. However, the amount of loss is difficult to calculate. Many efforts exist to sample a snapshot of that diversity now (inventory studies, hotspot conservation and DNA tissue banks) as the world is rapidly changing from increased human activity.
There is a budding field of ecology known as “historical ecology”, where we recognize that our conception of nature is changing with every new generation. We tend to accept what is natural around us as we grow up as a generational baseline for conservation. Examples include how New Yorkers consider Central Park a natural area, the English vision of the naturalness of the English countryside, and the ever-changing view regarding the ideal composition of our national forests. If we only look at what is presently found in a given environment, we miss the historical ecology of an area and its future potential for conservation and restoration. Notes: (what do you eat for breakfast). Our conception of nature / natural world changes with each generation.
We are typically aware of modifications in our natural environment on a very gross scale. For example, the American Chestnut, Castanea dentata, is a large, monoecious deciduous tree of the beech family native to eastern North America. Before the species was devastated by the chestnut blight, a fungal disease, it was one of the most important forest trees throughout its range. Along with the devastation of the majestic chestnut, all of its animal associates also had to adjust to the resulting modification of the forests. The question is how we can look into the past in order to reveal how some of these changes occurred. In this analysis I focused on one group of insects. The Hemiptera (e.g., cicadas, aphids, planthoppers, assassin bugs, milkweed bugs, leafhoppers, treehoppers, plant bugs, stink bugs, and many others) is a highly diverse order of insects, with an estimated 100,000 species worldwide, and around 11,150 species documented for North America. About 85% of Hemiptera feed on plants by directly piercing tissues, and many show a high degree of plant host specificity. The analysis took advantage of the data collection efforts of two major NSF-funded projects focused on Hemiptera, the Plant Bug PBI and the Tri-Trophic TCN, that together aggregated one of the most comprehensive specimen datasets for any order of insect. This dataset, termed the "Hemipteran Dataset", represents over 1.5 million specimen data records digitized from 190 different natural history collections, 141 hemipteran families, 310 host plant families, 413,400 recorded species interactions, and over 1200 habitat descriptions, and contains historical specimens collected from 1890 to the present. These data, obtained from natural history collection specimen labels, contain information about the specimen at the time it was collected, including specimen sex, life history stage, phenology, collection location, habitat, and host plant.
The Hemipteran Dataset is almost unique in its comprehensive coverage for one group of insects in this regard.
Hemiptera consist of 50-80 thousand species worldwide, many of which use a proboscis to feed on plants. The Aphididae and the plant bugs, in the family Miridae, are two diverse families within the order, both of which commonly feed on plant hosts. As the favored host species declines, host switching may occur, as some hemipterans have diverse feeding habits. However, it is well known from the literature and from discussions with domain experts that some groups of insects are not able to switch hosts so readily. The challenge is to use literature and specimen data records to calculate the probability that a host switching event could occur, or the probability that a certain insect (x) is found on plant (y). A higher probability is an indication that the insect may not have the ability to host switch.
Dealing with messy data requires leaning toward probability calculations, and the data we are collecting from the efforts to digitize natural history collections are beautifully messy. Natural history collections are our window into the ecology of the past, but we face a grand challenge: the data are non-standard, inconsistent, not in a digital format, and difficult to summarize. Error is introduced into the data from collecting and curation methods, misidentification, or unsubstantiated observations in the literature. When dealing with rare events, such as collecting on endangered plants, parsing out these errors can be a challenge. The data are known to be difficult, but methods do exist, specifically in the computer science field of “machine learning”, to help us deal with heavily biased data.
The hemipteran dataset contains over a million records, all digitized as part of one of the first Thematic Collection Network projects, Plants, Herbivores, and Parasitoids: A Model System for the Study of Tri-Trophic Associations, and a Planetary Biodiversity Inventory project for plant bugs. The data include 124 total hemipteran families, ranging in date from 1811 to the present.
The reality is that data deficiency is a great challenge in calculating biodiversity loss for non-keystone species. Insects are some of the most numerous organisms on the planet, with unequaled biodiversity, and we have no idea of the impact in our changing world. These numbers, retrieved from the International Union for Conservation of Nature (IUCN) website, indicate the number of evaluated species for each animal class. Mammals, birds and reptiles have been comparatively easier to evaluate, likely due to large body size. The most interesting number in this calculation, however, is the “data deficient” column. It seems that the majority of the 950,000 described species would predominantly fall in this category; however, it is unlikely that they will ever be evaluated.
Many plants, however, have been evaluated by the IUCN as well as the USDA. If we compare the entire million-plus hemipteran data records to these red lists we see: For IUCN, worldwide total plants: 345 plant species, 322 Hemiptera species. For USDA, North American total plants: 752; 31 plant species; 127 Hemiptera species. The network diagram in the background represents the hemipteran data network for all associated USDA red-listed plants, plus any other plant recorded as a host for those insects. The blue balls are insect nodes (species) and the green ones are plant nodes (species). Every line in between is a connection between an insect and a plant. If we zoom in…
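The comparison against a red list described above can be sketched roughly as follows. This is only an illustration in Python: the record layout, the field names, and the example names are hypothetical, since the talk does not describe the dataset's schema. It keeps every insect seen at least once on a red-listed plant, together with that insect's records on its other hosts.

```python
from collections import namedtuple

# Hypothetical record layout; the real dataset's field names are not given in the talk.
Record = namedtuple("Record", ["insect", "plant", "event_id"])


def red_listed_subset(records, red_list):
    """Keep records of every insect seen at least once on a red-listed plant,
    including that insect's records on its other (non-red-listed) hosts."""
    red_list = set(red_list)
    insects_on_red = {r.insect for r in records if r.plant in red_list}
    return [r for r in records if r.insect in insects_on_red]


# Made-up example data, for illustration only.
records = [
    Record("insect_a", "plant_red", "e1"),
    Record("insect_a", "plant_common", "e2"),
    Record("insect_b", "plant_common", "e3"),
]
subset = red_listed_subset(records, {"plant_red"})
```

Here insect_b never appears on a red-listed plant, so only the two insect_a records remain in the subset.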
We can begin to see some trends: 1) some plants are linked to many insects; 2) some insects are linked to many plants. We want to understand the relative importance of a red-listed plant by the number of insects for which it is a host, and how opportunistic an insect is by how many plants are recorded hosts for that insect. We need to take into account how many times the insect was collected on any plant, as well as the number of collecting events on red-listed plants.
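One simple way to read the two trends off such a network is to count distinct neighbors for each node. The sketch below is an assumption about how this might be done, not the method used in the talk; the function name and the example links are hypothetical.

```python
from collections import defaultdict


def node_degrees(links):
    """links: iterable of (insect, plant) pairs, one per recorded association.
    Returns (plants per insect, insects per plant), counting distinct neighbors."""
    plants_per_insect = defaultdict(set)
    insects_per_plant = defaultdict(set)
    for insect, plant in links:
        plants_per_insect[insect].add(plant)
        insects_per_plant[plant].add(insect)
    insect_degree = {i: len(p) for i, p in plants_per_insect.items()}
    plant_degree = {p: len(i) for p, i in insects_per_plant.items()}
    return insect_degree, plant_degree


# Made-up example links; the repeated pair counts once.
links = [
    ("insect_a", "plant_red"),
    ("insect_a", "plant_common"),
    ("insect_b", "plant_red"),
    ("insect_b", "plant_red"),
]
insect_degree, plant_degree = node_degrees(links)
```

A plant with a high count hosts many insects, and an insect with a high count is recorded on many plants. Repeated pairs are collapsed here, so collection frequency still has to be handled separately.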
We can then calculate the probability, given our subset of data (red-listed USDA plants, and all insects observed to associate with those plants). Only multiple independent collecting events (i.e., singletons removed) were used in the analysis. For the 127 species of Hemiptera in this dataset (31 insects collected from USDA red-listed plants and all of their other known hosts), there is a higher probability that the insect was collected on a red-listed host than on a non-red-listed host. 19 species have a > 10% probability that they were collected on a federally Endangered or Threatened plant. 9 species are only known to be collected on a red-listed species.
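A minimal sketch of such a calculation follows, assuming the probability is estimated as the fraction of an insect's distinct collecting events that fall on a red-listed plant. The talk does not give the exact formula. The tuple layout, the function name, and the reading of "singletons removed" as dropping insects with fewer than two distinct events are all assumptions made for illustration.

```python
from collections import Counter


def red_list_probability(events, red_list, min_events=2):
    """events: iterable of (insect, plant, event_id) tuples.
    Returns {insect: fraction of its distinct collecting events on a red-listed plant}.
    Insects with fewer than min_events distinct events are dropped (assumed
    interpretation of removing singletons)."""
    red_list = set(red_list)
    total = Counter()
    on_red = Counter()
    for insect, plant, event_id in set(events):  # de-duplicate repeated rows
        total[insect] += 1
        if plant in red_list:
            on_red[insect] += 1
    return {i: on_red[i] / n for i, n in total.items() if n >= min_events}


# Made-up example events, for illustration only.
events = [
    ("insect_a", "plant_red", "e1"),
    ("insect_a", "plant_red", "e2"),
    ("insect_a", "plant_common", "e3"),
    ("insect_a", "plant_common", "e4"),
    ("insect_b", "plant_red", "e5"),
    ("insect_b", "plant_red", "e6"),
    ("insect_c", "plant_red", "e7"),  # single event, dropped
]
probs = red_list_probability(events, {"plant_red"})
above_ten_percent = sorted(i for i, p in probs.items() if p > 0.10)
only_red_listed = sorted(i for i, p in probs.items() if p == 1.0)
```

With these made-up events, insect_a has half of its events on the red-listed plant, insect_b is known only from it, and insect_c is excluded because it has a single event.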
Analysis methods beyond those that are conventionally applied in biodiversity research will be needed in order to extract meaningful information from these natural history collection data. Fortunately, research in Data Science has rapidly matured and been applied in other areas that involve large quantities of information (social media and genomics). Computer Science methods that include machine learning and ontology have yielded a variety of new techniques for pattern recognition, data quality assessment, and trend analysis. In very general terms, data science refers to the extraction of knowledge from data. Ontology and machine learning are methods and areas of research in computer science that are significantly associated with data science. Machine learning refers to the development of algorithms to facilitate pattern recognition, classification, and prediction, based on models derived from existing data. Ontology, or ontological reasoning, can be defined as the process of inferring information about the organization of descriptive terms utilizing structured and explicitly defined dictionaries. In the future, I plan to continue to explore the human impacts on biodiversity, by taking bold steps toward the cross-pollination of new methods between biology, informatics, and computer science.
Many people made this effort possible, including museum curators, collection managers and collectors worldwide.
