Introductory talk for a symposium on agrobiodiversity informatics at the 2016 annual meeting of Biodiversity Information Standards (TDWG). It begins with an overview of the symposium and its speakers, and then launches into my talk.
2. Plan for this symposium
• Cyndy Parr - Overview
• Dimitry Schigel – Sampling event standardization
• Gail Kampmeier – Arthropod data gaps
• Nicole Kaplan – Long-term Agroecosystem Research
• Ramona Walls – Plant contextual metadata
• Panel discussion
3. Credit: Phenocam USDA-ARS Hawbecker Farm, PA
Cynthia Parr @cydparr
US Department of Agriculture
National Agricultural Library
6 December 2016
Biodiversity informatics and the agricultural data landscape
8. GEOGLAM
[Diagram labels]
• Driver data [NOAA: climate, weather; NRCS: land use, soils; ARS: invasives; APHIS: pests, pathogens]
• Crop and range production (area, yield by type, land use, location)
• Observations, experiments, land use, management strategies
• Imagery: NASA, ARS
• Legacy data: LTAR, ARS, NASS, ERS, NRCS
• Short-term data: FSA, ARS, NASS, NIFA, GRA
• Conceptual models; statistical models; simulation models (AgMIP, others)
• Site-level synthesis
• GOAL: Regional to national-level synthesis and integration
– forecast vulnerability of agrosystems to global change drivers
– develop potential adaptation strategies
• INFORMATION TRANSFER: land managers, public, decision makers; climate hubs
• Feedbacks to data and knowledge gaps
US GLAM – links imagery with ground
observations across contiguous US and globally
Credit: Deb Peters
12. Use case 1: Coping with climate change in wildlife conservation
• Lupita is a park ranger
• Some park species threatened (perhaps by agriculture)
• Latest projections are hotter and wetter in 50 years
• She has limited resources
• Which species might be at risk and what are her options for mitigation?
From: Thessen et al. (2016)
13. Use case 2: Coping with climate change in agriculture
– Steve is a scientist working for a seed company
– Farmers need crop hybrids that perform well in projected drier, warmer climates
– Drought tolerant, have high, stable yields, suitable for local soil
– Emerging diseases, invasive pests, and threats to pollinators
– Which wild species and varieties to include in breeding program?
From: Thessen et al. (2016)
14. What does this mean?
• Leveraging big data resources together
• Share best practices and interoperable systems
• Use cases crossing both ag and biodiversity
15. Shared agenda
• Identify and address gaps in standards and services for machine-readable data dictionaries, thesauri, and ontologies.
• Strengthen the use of Globally Unique Identifiers
• Ride public access mandates and advances in high performance computing to promote text and data mining and modeling.
UN SUSTAINABLE DEVELOPMENT GOALS 12 AND 15: FEED THE WORLD WHILE REDUCING IMPACTS ON CLIMATE, SOIL, WATER, AND BIODIVERSITY
GET THE MOST VALUE OUT OF BOTH OLD AND NEW DATA
GBIF FITNESS FOR USE RECOMMENDATIONS FOR AGROBIODIVERSITY
STILL SOME DISTANCE TO GO, SO WE ORGANIZED THIS SYMPOSIUM
ABSTRACT
Symposium 09: Agricultural Biodiversity Standards and Semantics
Standards for managing information about Agricultural Biodiversity are a critical societal need. Among the UN's Sustainable Development Goals are numbers 12 and 15, which call for feeding the world while reducing impacts on climate, soil, water, and biodiversity. In order to achieve these goals, we must get the most value out of both old and new data and information. The Global Biodiversity Information Facility recently released data fitness-for-use recommendations for agrobiodiversity, and research by Bioversity International, the USDA Agricultural Research Service, and others on conserving genetic biodiversity of crop plants is widely recognized as important. However, other types of biodiversity are less appreciated by traditional agricultural scientists, and these scientists do not contribute their data to repositories for “traditional” biodiversity inventories. Yet they have a wealth of data, often collected over years in field experiments designed to show differences in how cropping systems affect fauna (herbivores, pollinators, and predators) and their phenology. Discovering and re-purposing the enormous quantities of data produced by the agricultural research community would require some modification of existing systems. What standards and ontologies would be required and what is missing from them? How can we mine existing datasets or motivate agricultural scientists to contribute new data to the global conversation? Integrating biodiversity data with various crops, cropping systems, and cultural practices could well provide insights for pollination services, predator/prey dynamics, epidemiology of plant viruses, and long distance movement of pests. Recent developments in agricultural semantics (e.g. the Global Agricultural Concept Scheme, the Plant Ontology), commitments to open data (GODAN, Global Open Data for Agriculture and Nutrition, http://godan.org), and several new data repositories present opportunities.
Speakers in this symposium will pursue synergies between biodiversity information standards and ontologies and the blossoming world of agricultural data management.
AS SOME OF YOU MAY KNOW, OVER TWO YEARS AGO I CHANGED JOBS. I HAD BEEN WORKING FOR THE SMITHSONIAN INSTITUTION ON THE EOL PROJECT. NOW I WORK FOR
HAVE SPENT THE LAST YEAR LEARNING ABOUT THIS NEW FIELD
My abstract
Biodiversity informatics and agricultural data management landscape
Historically, ecological and biodiversity researchers have focused on the basic patterns and processes of populations, communities, and ecosystems, with minimal attention paid to the role of humans. Human impacts have instead been addressed in the more applied sciences of conservation, medicine, and agriculture. In recent years, however, boundaries between applied and basic sciences have blurred. There is general recognition that our future is best served by science that seeks to understand systems in their true, full contexts. Societies cannot live sustainably without an understanding of the biosphere and how humans and their behavior and management practices might impact it. Data infrastructure (e.g. data management systems, metadata standards, ontologies) must therefore accommodate use cases that span managed and "pristine" systems. In this talk we describe the challenges faced by agricultural research communities that share some domain-specific data needs with basic biodiversity and ecology research communities, but that also share needs with the social science and biomedical communities. Big data in agriculture involves both real-time environmental and high throughput genomics and phenomics. Long-term data include social science surveys, repeated crop rotation experiments, and basic monitoring of soil and water and weather conditions. Battling emerging pests or adapting cropping or ranching activities to climate change requires an understanding of wild relatives and microbial ecology. We sketch out a landscape of loosely coupled data and analysis infrastructures and policies that are being developed to address these challenges, with special focus on the United States. Some parts of this landscape are centered at the US National Agricultural Library (NAL), e.g. the Ag Data Commons, i5K workspace, and Life Cycle Assessment Commons.
Other parts are led elsewhere in the US Department of Agriculture (Long Term Agroecosystem Research initiative, National Institute of Food and Agriculture's data science program). Other government agencies, universities, and private organizations all play critical roles. Some parts of the landscape are already familiar to the biodiversity informatics community but agricultural use cases can help all of us work together on best practices and interoperable systems. Collectively, we can identify and address gaps in standards and services for machine-readable data dictionaries, thesauri, and ontologies. We can strengthen the use of Globally Unique Identifiers and ride public access mandates and advances in high performance computing to promote text and data mining and modeling. We can build a living knowledge landscape that serves and promotes both basic and applied research.
Historically, ecological and biodiversity science researchers have paid minimal attention to the role of humans. If you were studying human impacts your work was seen as more applied.
In recent years, boundaries between applied and basic sciences have blurred. There is now general recognition that our future is best served by science that seeks to understand systems in their true, full contexts.
These are just a relatively random set of examples I’ve drawn up for illustration purposes. There’s no obvious flow in these examples from left to right. There are many more examples I could have used.
If you look closely you might argue with where I’ve put things. For example, you might argue that after users do their research they analyze and share their data, and THEN it could get standardized, and a different kind of analysis could come both before and after that.
Answering questions may just be another way of solving problems.
You could put ontologies in with Vocabs and standards, and the Analysis might involve semantic reasoning.
Just take the framework right now as the important takeaway here.
Let’s do the same thing. One thing is a bit obvious – logos aren’t as pretty.
But seriously, I’m still learning myself about all these projects. But there are clear parallels. I want to focus on some of these in this column which is where I tend to spend most of my time.
http://science.sciencemag.org/content/354/6316/twil
Asian longhorned beetle genome
Genome Biol. 17, 227 (2016).
Asian longhorned beetle larvae digest wood by using acquired genes.
PHOTO: SCOTT CAMAZINE / ALAMY STOCK PHOTO
Group on Earth Observations Global Agricultural Monitoring (GEOGLAM)
Taking agricultural data on the upper left, much of it from USDA and/or remote sensing.
Combining it with driver data on climate, weather, land use, invasives, pests and pathogens.
Putting these together and doing some modeling.
The aim is to provide regional or national forecasts on the vulnerability of agriculture to drivers and to develop adaptation strategies. Then transfer that info out to the people who need it AND
provide feedback to the data providers about problems or gaps.
Looks like a mess, but the value of this exercise is as follows.
Each of the things in each bin automatically shares a lot, at least conceptually. Could they share more? Vocabularies? Data exchange formats? Data quality approaches? Policies?
How do we foster better connections across the columns, both within and across domains?
User Story 1: Coping with climate change in wildlife conservation
Lupita is a park ranger who manages a coastal wildlife sanctuary. Some of the species in her sanctuary are listed as threatened by the IUCN. According to the latest climate change projections, her sanctuary is going to be hotter and wetter in 50 years. She has limited resources to maintain the biodiversity in her sanctuary for the long term. She needs to identify which species might be at risk under the projected future climate regime and consider her options for mitigating that risk.
User Story 2: Coping with climate change in agriculture
Steve is a scientist working for a seed company. He wants to develop crop hybrids that perform well in the drier, warmer climates predicted for the next 30 years in the region of the country that he serves. He knows that farmers will want to plant crops that are drought tolerant and have high, stable yields. These crops must also be suitable for local soil conditions and sometimes rapidly changing factors such as emerging diseases, invasive pests, and threats to pollinators. He needs to identify promising species and varieties so he can include them in his breeding program.
What does this mean for data?
Data infrastructure (e.g. data management systems, metadata standards, ontologies) must therefore accommodate use cases that span managed and "pristine" systems. In this talk we describe the challenges faced by agricultural research communities that share some domain-specific data needs with basic biodiversity and ecology research communities, but that also share needs with the social science and biomedical communities.
Mention the use case in the recent chapter that Anne and I wrote
Point to some big data talks later in the day