Open data and the ag data commons

Webinar presentation by Cyndy Parr and Erin Antognoli hosted by Hunger Solutions Institute (HSI) and Presidents United to Solve Hunger (PUSH) at Auburn University on April 25, 2019.

Published in: Technology

  1. Open Data and The Ag Data Commons Presented by Cyndy Parr & Erin Antognoli April 25, 2019
  2. Agenda Open data ● Definition and basics Ag Data Commons ● USDA research data catalog ● Open agricultural data National Agricultural Library services ● Data dictionaries ● Data management plans
  3. Open Data The basics and background
  4. Open data policy history 2013 - Obama administration’s open data policy memo ● Directed all federal agencies to publish their information as machine-readable data, using searchable, open formats ● Required every agency to maintain a centralized Enterprise Data Inventory that lists all data sets ● Mandated a centralized inventory for the whole government – the platform currently known as data.gov 2019 - OPEN Government Data Act becomes law https://project-open-data.cio.gov/policy-memo/ https://www.congress.gov/bill/115th-congress/house-bill/4174/text
  5. Public access policy history 2013 - “Holdren memo” issued by Office of Science and Technology Policy 2014 - USDA Implementation Plan approved 2016 - USDA Public Access Policy for Scholarly Publications approved ● CHORUS will provide access to many published articles ● Submission of accepted manuscripts to PubAg (pubag.nal.usda.gov) is imminent 2019 - Anticipate approval of USDA Public Access Policy for Digital Scientific Data https://go.usa.gov/xmB9a https://go.usa.gov/xmB92
  6. Open data is... “...data that can be freely used, re-used and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike.” ~ Open Data Handbook Why is a clear definition of open data important? Interoperability - different datasets should be able to work together ● Availability and access ● Re-use and redistribution ● Universal participation http://opendatahandbook.org/guide/en/what-is-open-data/
  7. Availability and Access “The data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the internet. The data must also be available in a convenient and modifiable form.” http://opendatahandbook.org/guide/en/what-is-open-data/
  8. Re-use and Redistribution “The data must be provided under terms that permit re-use and redistribution including the intermixing with other datasets.” http://opendatahandbook.org/guide/en/what-is-open-data/
  9. Universal Participation “Everyone must be able to use, re-use and redistribute - there should be no discrimination against fields of endeavour or against persons or groups. For example, ‘non-commercial’ restrictions that would prevent ‘commercial’ use, or restrictions of use for certain purposes (e.g. only in education), are not allowed.”
  10. FAIR principles reinforce open data ● Findable: rich metadata, persistent identifiers ● Accessible: fixity, data & metadata available to target audience ● Interoperable: open formats, common metadata standards, controlled vocabularies ● Reusable: usage license, provenance, community standards FAIR Principles https://www.force11.org/group/fairgroup/fairprinciples
  11. Ag Data Commons USDA open agricultural data
  12. The Ag Data Commons is... ● A catalog and data repository for open agricultural research data ● The catalog for all USDA-funded research data ● Satisfies the federal open data requirements ● Satisfies the USDA public access requirements https://data.nal.usda.gov/
  13. Ag Data Commons collection policies Ag-related data ● Many high-level categories, e.g. Agroecosystems & Environment, Agricultural Economics, Bioenergy, Agricultural Products USDA Funding ● USDA-funded data or data from USDA researchers working on collaborative projects DOI ● Assigned for locally held resources Version policy https://data.nal.usda.gov/
  14. Ag Data Commons features Groups by project or affiliation ● Programs can request a tag to keep all their data entries grouped together ● Data hierarchies one level deep supported (parent / child) ORCID integration ● Authors can link to their profiles to prevent ambiguity Citations ● Specify a citation for your own data ● Link to scholarly publications or data papers / PubAg ● Link to other related data content https://data.nal.usda.gov/
  15. Submission limitations Data should have ties to USDA ● Funder, collaborator, or employer File size - 20 GB per file max ● Larger size data storage pilot underway! No executables allowed ● Executables can be cataloged with a pointer to the software/code, but not deposited directly https://data.nal.usda.gov/
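The file limits on the submission limitations slide can be checked locally before uploading. The following is a minimal sketch only: the slide does not say how the Ag Data Commons enforces these rules, so the function name `check_file`, the decimal reading of "20 GB", and the list of executable extensions are all assumptions made for illustration.

```python
from pathlib import Path

# Assumption: "20 GB" is read as decimal gigabytes (20 * 10**9 bytes).
MAX_FILE_BYTES = 20 * 10**9

# Hypothetical, incomplete set of suffixes treated as executables here;
# the repository's actual rule is not given in the slides.
EXECUTABLE_SUFFIXES = {".exe", ".dll", ".so", ".bat", ".msi"}


def check_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems found for one candidate resource file."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"{name}: {size_bytes} bytes exceeds the 20 GB per-file limit")
    if Path(name).suffix.lower() in EXECUTABLE_SUFFIXES:
        problems.append(f"{name}: executables should be cataloged by pointer, not deposited")
    return problems


if __name__ == "__main__":
    for name, size in [("yield_trials.csv", 48_000), ("model.exe", 25 * 10**9)]:
        for problem in check_file(name, size):
            print(problem)
```

A file that fails the executable check can still be described in the catalog, with the entry pointing to where the software or code lives.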
  16. Submit ag-related data Create an account ● https://data.nal.usda.gov/user/register Data submission form ● Metadata entry ● Workflow tools ● Clone metadata ● Separate descriptions for each resource file Metadata - Project Open Data ● Open standard ● Formatted for ingest into data.gov ● https://project-open-data.cio.gov/schema/ https://data.nal.usda.gov/
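To make the "Metadata - Project Open Data" point concrete, here is a rough sketch of a dataset record using field names from the Project Open Data metadata schema (v1.1), together with a check for the fields that schema marks as always required. Every value in the record is invented for illustration, the helper `missing_fields` is hypothetical, and the Ag Data Commons submission form may label or collect these fields differently.

```python
import json

# Fields the Project Open Data v1.1 schema lists as required for every dataset.
# Federal agencies additionally supply bureauCode and programCode (omitted here).
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


# Illustrative record only: all values below are made up.
record = {
    "@type": "dcat:Dataset",
    "title": "Example wheat yield trials",
    "description": "Plot-level yield measurements from a hypothetical field trial.",
    "keyword": ["wheat", "yield", "field trial"],
    "modified": "2019-04-25",
    "publisher": {"@type": "org:Organization", "name": "Example Research Unit"},
    "contactPoint": {
        "@type": "vcard:Contact",
        "fn": "Example Contact",
        "hasEmail": "mailto:contact@example.org",
    },
    "identifier": "example-wheat-yield-trials",
    "accessLevel": "public",
}

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
    print("Missing required fields:", missing_fields(record))
```

Keeping the record in this shape is what lets a catalog pass entries on to data.gov without re-keying the metadata.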
  17. Data dictionaries Advancing open data through transparency and reusability
  18. A data dictionary is... … a collection of descriptions of the data objects or items in a dataset or model for the benefit of programmers and others who need to refer to them.
  19. Ag Data Commons supports data dictionaries Encouraged as part of catalog entry in the Ag Data Commons ● A special designation for data dictionary resources in the submission form ● CSV format preferred, other machine-readable formats accepted
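Since CSV is the preferred format for data dictionaries, a first draft can be generated from the data file itself and then completed by hand. The sketch below is one possible approach, not NAL's procedure or template: the output columns (`variable`, `type`, `example`, `description`) and the simple type inference are assumptions chosen for illustration.

```python
import csv
import io


def infer_type(values: list[str]) -> str:
    """Guess a coarse type label from the non-empty values of one column."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "empty"
    for caster, label in ((int, "integer"), (float, "decimal")):
        try:
            for v in non_empty:
                caster(v)
            return label
        except ValueError:
            continue
    return "text"


def build_dictionary(data_csv: str) -> str:
    """Return a draft data dictionary (as CSV text) for the given CSV data."""
    reader = csv.DictReader(io.StringIO(data_csv))
    rows = list(reader)
    columns = reader.fieldnames or []

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical column layout; the description is left blank for the author.
    writer.writerow(["variable", "type", "example", "description"])
    for col in columns:
        values = [(row.get(col) or "") for row in rows]
        example = next((v for v in values if v != ""), "")
        writer.writerow([col, infer_type(values), example, ""])
    return out.getvalue()


if __name__ == "__main__":
    sample = "site,year,yield_kg_ha\nA,2018,5120.5\nB,2019,\n"
    print(build_dictionary(sample))
```

The generated file only covers what can be read from the data; units, allowed values, and the meaning of each variable still need to be written by someone who knows the dataset.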
  20. NAL offers data dictionary resources Ag Data Commons submission manual ● https://data.nal.usda.gov > under the About tab ● Instructions for automatic and manual generation ● Blank template Data dictionary webinars ● National Agricultural Library YouTube channel ● Link under the Ag Data Commons “About” tab Direct questions / advice / help ● NAL-ADC-Curator@ars.usda.gov
  21. Data Management Plans More steps toward open data
  22. DMPs are required for USDA funding proposals USDA funding proposals now require a DMP There is a specific format for NIFA DMP - 2 pages with 5 sections* ● Expected data types ● Data formats (and standards) ● Data storage and preservation (of access) ● Data sharing, protection, and public access ● Roles and responsibilities *Note: Other agencies or institutions may require a different format
  23. NAL assists with DMPs USDA DMP guide ● https://www.nal.usda.gov/ks/guidelines-data-management-planning NAL provides DMP draft review ● USDA researchers and collaborators can send their drafts to NAL-ADC-Curator@ars.usda.gov for review DMP Webinars ● National Agricultural Library YouTube channel ● Linked under the Ag Data Commons “About” tab
  24. Other resources at NAL Webinars ● Recordings available publicly on the NAL YouTube channel ● Anyone may join future webinars - email NAL-ADC-Curator@ars.usda.gov to be added to the list Ag Data Commons site ● Submission manual, policy pages, etc., all linked under the “About” tab PubAg ● https://pubag.nal.usda.gov/ Knowledge Services website ● https://www.nal.usda.gov/ks
  25. Summary Open data ● Required for federal research ● Available and accessible for reuse and redistribution ● FAIR principles - Findable, Accessible, Interoperable, Reusable Ag Data Commons ● USDA’s catalog for ag research data ● Agricultural data submissions Guidelines and assistance at NAL ● Data dictionaries ● Data management plans
  26. Questions? NAL-ADC-Curator@ars.usda.gov
