Linking systems to improve data quality

My talk at TDWG 2013 presented our work on linking different biodiversity information systems to improve data quality on both sides. We built an interoperability workflow between data aggregators, such as GBIF or VertNet, and other biodiversity information aggregators, such as Map Of Life, which hold information like IUCN expert range maps, regional checklists, or gridded surveys. Although this was a work in progress at the time of the presentation (hence the "warning" sign), we were able to show that communication between both kinds of systems allowed us to detect spatio-taxonomic biases and issues in both sources. We also explored the possible causes of those errors and tried to model the error rates, finding that newer data, published through newer mechanisms, showed lower error rates. We concluded that, although more work is needed to reach a deeper understanding, we are entering a new age of biodiversity information sharing, in which quality, rather than quantity, is becoming the key feature. We also believe that the Integrated Publishing Toolkit (IPT), developed by GBIF, might be the banner of this movement towards sharing better-quality data. This might be because it is an easier tool to use, because it makes building auxiliary tools and mechanisms for improving quality easier, or simply because people are becoming aware of the importance of having a good-quality data set.
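As a rough illustration of the kind of cross-check described above, the following minimal sketch flags occurrence records against a species range map. It is not the workflow used in the talk: the record fields (`species`, `lon`, `lat`), the category names, and the square "range map" are all hypothetical, and the geometry is a plain planar ray-casting test that ignores holes, multipolygons, and the antimeridian.

```python
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[float, float]   # (longitude, latitude), hypothetical convention
Polygon = Sequence[Point]     # simple closed ring, vertices in order


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray casting: count crossings of a horizontal ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_record(record: dict, range_maps: Dict[str, Polygon]) -> str:
    """Assign one illustrative category to an occurrence record."""
    lon: Optional[float] = record.get("lon")
    lat: Optional[float] = record.get("lat")
    if lon is None or lat is None:
        return "missing_coordinates"
    polygon = range_maps.get(record["species"])
    if polygon is None:
        return "no_range_map"
    if point_in_polygon((lon, lat), polygon):
        return "inside_range"
    return "outside_range"


# Hypothetical data: one species with a square range map.
range_maps = {"Species a": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]}
records = [
    {"species": "Species a", "lon": 5.0, "lat": 5.0},
    {"species": "Species a", "lon": 20.0, "lat": 5.0},
    {"species": "Species a", "lon": None, "lat": None},
    {"species": "Species b", "lon": 1.0, "lat": 1.0},
]
categories = [classify_record(r, range_maps) for r in records]
```

Counting the resulting categories per taxonomic class would give proportions of the kind reported on the slides; a real analysis would more likely rely on a GIS library or a spatial database than on hand-written geometry.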

Published in: Education, Technology, Business

  1. Linking systems to improve data quality. Javier Otegui and Rob Guralnick
  2. >210M occurrence points + All country boundaries + All IUCN range maps = 9 different spatial and spatio-taxonomic issues. Causes??
  3. >210M occurrence points + All country boundaries + All IUCN range maps = 9 different spatial and spatio-taxonomic issues. Causes??
  4. Time: fewer records without coordinates; more records inside range maps.
     Source: most populated: IPT; fewest issues in data: IPT.
     Some values (proportion of terrestrial vertebrate records):

     Class      With any issue   Inside range maps   Without range maps
     Amphibia   12%              30%                 50%
     Aves       14%              69%                 7%
     Mammalia   17%              37%                 38%
     Reptilia   14%              7%                  76%

  5. Thank you!
