
FAIR Metrics - Presentation to NIH KC1

This is a short presentation about the FAIR Metrics Evaluator - software that automates the application of the FAIR Metrics to a given resource in order to determine its degree of "FAIRness".


  1. 1. FAIR Metrics & Evaluation Mark D. Wilkinson, Susanna-Assunta Sansone, Erik Schultes, Peter Doorn, Luiz Olavo Bonino da Silva Santos, Michel Dumontier
  2. 2. What makes a “good” Metric? (for anything!)
  3. 3. What makes a “good” Metric? (for anything!)
     ● Clear: so that anybody can understand what is meant
     ● Realistic: so that anybody can report on what is being asked of them
     ● Discriminating: so that we can distinguish the degree to which a resource meets a specific FAIR principle, and can provide instruction as to what would maximize that value
     ● Measurable: the assessment can be made in an objective, quantitative, machine-interpretable, scalable and reproducible manner → transparency of what is being measured, and how
     ● Universal: the extent to which the metric is applicable to all digital resources
  4. 4. Rubric for designing a Metric We defined a set of parameters that must be considered for every metric. These parameters help ensure that the Metric you are designing is “good”.
  5. 5. Metric Identifier: FAIR Metrics should, themselves, be FAIR objects, and thus should have globally unique identifiers.
     Metric Name: a human-readable name for the metric.
     To which principle does it apply? Metrics should address only one sub-principle, since each FAIR principle is particular to one feature of a digital resource; metrics that address multiple principles are likely to be measuring multiple features, and those should be separated whenever possible.
     What is being measured? A precise description of the aspect of that digital resource that is going to be evaluated.
     Why should we measure it? Describe why it is relevant to measure this aspect.
     What must be provided? What information is required to make this measurement?
     How do we measure it? In what way will that information be evaluated?
     What is a valid result? What outcome represents "success" versus "failure"?
     For which digital resource(s) is this relevant? If possible, a metric should apply to all digital resources; however, some metrics may be applicable only to a subset. In this case, it is necessary to specify the range of resources to which the metric is reasonably applicable.
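     As a rough illustration, the rubric can itself be captured as a small data structure so that metric definitions stay machine-readable. The class and field names below are illustrative placeholders, not an official schema; the example values are taken from the FM-F1A slides later in this deck.

        from dataclasses import dataclass, field
        from typing import List

        @dataclass
        class MetricSpecification:
            """Illustrative container for the rubric fields above (not an official schema)."""
            identifier: str   # globally unique, persistent identifier for the metric itself
            name: str         # human-readable name
            principle: str    # the single FAIR (sub-)principle it addresses, e.g. "F1"
            measures: str     # what aspect of the resource is evaluated
            rationale: str    # why this aspect is worth measuring
            requires: str     # what information must be provided
            procedure: str    # how that information is evaluated
            validity: str     # what outcome counts as success vs. failure
            applies_to: List[str] = field(default_factory=lambda: ["all digital resources"])

        # The rubric applied to one metric (content drawn from the FM-F1A slides below)
        fm_f1a = MetricSpecification(
            identifier="https://purl.org/fair-metrics/FM_F1A",
            name="Identifier Uniqueness",
            principle="F1",
            measures="Whether there is a scheme to uniquely identify the digital resource",
            rationale="A unique identifier is necessary to unambiguously refer to the resource",
            requires="URL to a registered identifier scheme",
            procedure="Check that the scheme is described in a registry such as fairsharing.org",
            validity="Present or Absent",
        )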
  6. 6. Caveat: we decided that we would only design metrics that test the FAIRness of a resource from the perspective of a machine. The FAIR principles emphasise that data must be FAIR both for humans and for machines, i.e. a machine should be able to replace a human with respect to:
     ● Discovery of the data of interest
     ● Discovery of how to access the data (both technological and “contractual”)
     ● Identification of the data format, and the ability to parse the data into a “sensible” in-memory representation
     ● Discovery of linked information
     ● Discovery (and download) of contextual information relevant to the interpretation of the data
     ● Discovery of the license associated with that data
  7. 7. https://tinyurl.com/FAIRMetrics-ALL
  8. 8. F1: (meta) data are assigned globally unique and persistent identifiers
  9. 9. F1: (meta) data are assigned globally unique and persistent identifiers How should we measure this in a way that is clear, realistic, measurable, discriminating, and universal?
  10. 10. Metric Identifier: FM-F1A: https://purl.org/fair-metrics/FM_F1A
      Metric Name: Identifier Uniqueness
      To which principle does it apply? F1
      What is being measured? Whether there is a scheme to uniquely identify the digital resource.
      Why should we measure it? The uniqueness of an identifier is a necessary condition to unambiguously refer to that resource, and that resource alone. Otherwise, an identifier shared by multiple resources will confound efforts to describe that resource, or to use the identifier to retrieve it. Examples of identifier schemes include, but are not limited to, URN, IRI, DOI, Handle, trustyURI, LSID, etc. For an in-depth understanding of the issues around identifiers, please see http://dx.plos.org/10.1371/journal.pbio.2001414
  11. 11. What must be provided? URL to a registered identifier scheme.
      How do we measure it? An identifier scheme is valid if and only if it is described in a repository that can register and present such identifier schemes (e.g. fairsharing.org).
      What is a valid result? Present or Absent
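      A crude sketch of how such a measurement could be automated: given the URL supplied for the identifier scheme, check that it resolves to a record in the registry. This assumes the registry is fairsharing.org and uses only an HTTP status check; the actual Metric Test applies stricter validation.

        import requests

        def identifier_scheme_present(scheme_url: str) -> str:
            """Crude FM-F1A-style check (illustrative only): does the supplied URL
            resolve to a record in a registry of identifier schemes?"""
            if "fairsharing.org" not in scheme_url:   # assumption: the registry is FAIRSharing
                return "Absent"
            try:
                response = requests.get(scheme_url, timeout=30, allow_redirects=True)
            except requests.RequestException:
                return "Absent"
            return "Present" if response.ok else "Absent"

        print(identifier_scheme_present("https://fairsharing.org/bsg-s001182"))  # the DOI record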
  12. 12. Conflates richness with machine-readability → bad metric. Join the conversation on GitHub!
  13. 13. Automated Metrics Testing
  14. 14. Caveats: please be gentle in your criticism :-) 1. I am doing this with my own hands… 2. … when I have “free” time (ha!) ... 3. … out of love… I will try to point out the things on my TODO list as we go through this, but please feel free to give me suggestions!
  15. 15. Desiderata for a FAIR Metrics testing framework
      1) All components of the framework should themselves be FAIR
      2) Tests should be modular - mirroring (as much as possible) the Metrics themselves
      3) Tests should be as objective as possible
      4) All stakeholders should be able to define their own Metric Tests
      5) All stakeholders should be able to define their own “Evaluations” (combinations of Tests relevant to that stakeholder)
      6) The Evaluation system should evolve to accept new standards, without re-coding
      7) Anyone should be able to evaluate anything
      8) The system should be aspirational - provide feedback for improvement
  16. 16. Architecture Overview
  17. 17. FAIR Evaluator Architecture Overview
  18. 18. FAIR Evaluator Architecture Overview
  19. 19. The FAIR Evaluator Automated, Objective, Aspirational! “Strawman” code by Mark Wilkinson https://tinyurl.com/FAIREvaluator
  20. 20. Overview
      - Every Metric is associated with a Web-based interface that can evaluate compliance with that Metric
      - New Metrics can be registered simply by pointing to the URL for their smartAPI
      - Collections of Metrics can be assembled by anyone, to represent the aspects of FAIRness that they care about (e.g. a journal vs. a funding agency vs. a researcher)
      - You can execute an evaluation by providing an IRI to be tested, and a collection of Metrics to be applied to it
      - Evaluations can be recovered (to see previous input data) and/or re-executed, either through the Web interface, or by direct connection to the Evaluator from software (i.e. the Web page is only for people)
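      To make the “collection” and “evaluation” ideas concrete: a collection can be thought of as nothing more than a list of Metric Test smartAPI URLs, and an evaluation as that list paired with the IRI of the resource to be tested. The structures below are a sketch only; the field names and URLs are hypothetical placeholders, not the Evaluator’s actual message format.

        # Sketch only: field names and URLs are illustrative placeholders.

        # A "Collection" is just a set of Metric Tests, each identified by the URL
        # of its smartAPI interface definition.
        journal_collection = {
            "name": "Metrics a journal might care about",
            "metric_tests": [
                "https://example.org/tests/FM_F1A/smartapi",  # hypothetical smartAPI URL
                "https://example.org/tests/FM_F1B/smartapi",  # hypothetical smartAPI URL
            ],
        }

        # An "Evaluation" pairs the IRI of the resource being tested with a collection.
        evaluation_request = {
            "subject": "https://doi.org/10.5061/dryad.XXXX",  # hypothetical IRI to evaluate
            "collection": journal_collection,
        }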
  21. 21. The Evaluator will soon be peer-reviewed, so we are NOT inviting you to try it yourself. However, if you must... please don’t “play with” existing evaluations; please create a new one.
  22. 22. Homepage
  23. 23. Homepage Browse Metrics
  24. 24. Browse Metrics
  25. 25. Browse Metrics What principle does this Metric test?
  26. 26. Browse Metrics What is the address of the testing interface?
  27. 27. Browse Metrics Browse to one of them….
  28. 28. Interface Definition (Swagger 2.0 with smartAPI extensions)
  29. 29. Homepage Browse Collections of Metrics
  30. 30. Metrics Collections So far I have only collected the individual FAIR facets together. Anyone can create a collection (but this is currently disabled).
  31. 31. Homepage Browse evaluations
  32. 32. Browse Evaluations
  33. 33. Explore one evaluation
  34. 34. Explore one evaluation See the result!
  35. 35. The Result Page
  36. 36. Want to re-execute? See the raw input data
  37. 37. Raw Input Data (copy/edit/paste and HTTP POST to the Evaluator)
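      A sketch of that re-execution from software rather than from the Web page. The endpoint path and payload keys shown here are assumptions, not the Evaluator’s documented API; the authoritative payload is whatever the “raw input data” page displays.

        import json
        import requests

        # Assumption: the Evaluator accepts the copied raw input data as a JSON POST.
        EVALUATOR_URL = "https://example.org/FAIREvaluator/evaluations"  # hypothetical endpoint

        raw_input_data = {
            "subject": "https://doi.org/10.5061/dryad.XXXX",                 # resource under test (placeholder)
            "collection": "https://example.org/FAIREvaluator/collections/1", # placeholder collection URL
            "answers": {"FM_F1A": "https://fairsharing.org/bsg-s001182"},    # questionnaire answers
        }

        response = requests.post(
            EVALUATOR_URL,
            data=json.dumps(raw_input_data),
            headers={"Content-Type": "application/json", "Accept": "application/json"},
        )
        print(response.status_code)
        print(response.text)  # the evaluation result, in whatever format the Evaluator returns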
  38. 38. Web-based Execution of an Evaluation
  39. 39. Web-based Execution of an Evaluation What is being tested?
  40. 40. Web-based Execution of an Evaluation This questionnaire field is automatically created by examining the Swagger definition of the Metric Tester.
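      A simplified sketch of that generation step: read the Metric Tester’s Swagger 2.0 document, find the POST operation, and turn each declared input into a questionnaire prompt. The document URL is a hypothetical placeholder, and real smartAPI definitions carry additional metadata not shown here.

        import requests
        import yaml  # PyYAML; Swagger 2.0 / smartAPI documents are commonly served as YAML

        SMARTAPI_URL = "https://example.org/tests/FM_F1A/smartapi.yaml"  # hypothetical location

        spec = yaml.safe_load(requests.get(SMARTAPI_URL).text)

        # Walk every path/operation and emit one questionnaire prompt per input field.
        for path, operations in spec.get("paths", {}).items():
            post = operations.get("post")
            if not post:
                continue
            for parameter in post.get("parameters", []):
                if parameter.get("in") == "body":
                    # In Swagger 2.0 the body parameter carries a schema with named properties.
                    properties = parameter.get("schema", {}).get("properties", {})
                    for name, definition in properties.items():
                        print(f"Questionnaire field '{name}': {definition.get('description', name)}")
                else:
                    # Query/path/form parameters become prompts directly.
                    print(f"Questionnaire field '{parameter['name']}': {parameter.get('description', '')}")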
  41. 41. What happens next? (the answer https://fairsharing.org/bsg-s001182 is entered into the questionnaire field, and SUBMIT is clicked)
  42. 42. { "count":7, "previous":null, "results":[ { "pk":3022, "bsg_id":"bsg-s001182", "name":"Digital Object Identifier", "shortname":"DOI", "description":"The digital object identifier (DOI) system originated in a joint initiative of three trade associations in the publishing industry (International Publishers Association; International Association of Scientific, Technical and Medical Publishers; Association of American Publishers). The system was announced at the Frankfurt Book Fair 1997. The International DOI® Foundation (IDF) was created to develop and manage the DOI system, also in 1997. The DOI system was adopted as International Standard ISO 26324 in 2012. The DOI system implements the Handle System and adds a number of new features. The DOI system provides an infrastructure for persistent unique identification of objects of any type. The DOI system is designed to work over the Internet. A DOI name is permanently assigned to an object to provide a resolvable persistent network link to current information about that object, including where the object, or information about it, can be found on the Internet. While information about an object can change over time, its DOI name will not change. A DOI name can be resolved within the DOI system to values of one or more types of data relating to the object identified by that DOI name, such as a URL, an e-mail address, other identifiers and descriptive metadata.rnrnThe DOI system enables the construction of automated services and transactions. Applications of the DOI system include but are not limited to managing information and documentation location and access; managing metadata; facilitating electronic transactions; persistent unique identification of any form of any data; and commercial and non-commercial transactions.rnrnThe content of an object associated with a DOI name is described unambiguously by DOI metadata, based on a structured extensible data model that enables the object to be associated with metadata of any desired degree of precision and granularity to support description and services. The data model supports interoperability between DOI applications.rnrnThe scope of the DOI system is not defined by reference to the type of content (format, etc.) of the referent, but by reference to the functionalities it provides and the context of use. The DOI system provides, within networks of DOI applications, for unique identification, persistence, resolution, metadata and semantic interoperability.rnrn", "homepage":"https://www.doi.org", "taxonomies":[ "Not applicable" ], "domains":[ "Centrally registered identifier", "Knowledge And Information Systems" ], "countries":[ "Worldwide" ], "type":"identifier schema", "doi":null, "maintainers":[ ] }, { "pk":3023, "bsg_id":"bsg-s001183", "name":"Persistent Uniform Resource Locator", "shortname":"PURL", curl -k -i -X GET "https://fairsharing.org/api/standard/summary/?type=identifier%20schema" -H "Accept: application/json" -H "Api-Key: XXXXXXXXXXXXXXXXXXXXX"
  43. 43. Verification of Answer(s) to Questionnaire (diagram: The Evaluator → Metric Tester → FAIRSharing.org) The Evaluator passes the F1 answer (“DOI”) to the Metric Tester: is DOI OK?
  44. 44. Verification of Answer(s) to Questionnaire The Metric Tester asks FAIRSharing.org: is this really a DOI, i.e. a registered identifier standard?
  45. 45. Verification of Answer(s) to Questionnaire FAIRSharing.org answers: yes, this is a DOI.
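      The same verification step expressed in Python rather than curl: the request mirrors the call shown on slide 42 (a FAIRSharing API key is required, and the endpoint behaviour is as presented in that slide), and the check simply looks for the submitted record among the registered identifier schemas.

        import requests

        API_KEY = "XXXXXXXXXXXXXXXXXXXXX"                # supply your own FAIRSharing API key
        ANSWER = "https://fairsharing.org/bsg-s001182"   # the questionnaire answer to verify

        # Same call as the curl on slide 42: list all registered identifier schemas.
        response = requests.get(
            "https://fairsharing.org/api/standard/summary/",
            params={"type": "identifier schema"},
            headers={"Accept": "application/json", "Api-Key": API_KEY},
            verify=False,  # mirrors the -k flag used in the slide's curl command
        )
        records = response.json().get("results", [])

        # Verification: does the submitted record appear among the registered schemas?
        submitted_id = ANSWER.rstrip("/").split("/")[-1]   # e.g. "bsg-s001182"
        match = next((r for r in records if r.get("bsg_id") == submitted_id), None)

        if match:
            print(f"Yes, this is a {match['shortname']}")  # e.g. "Yes, this is a DOI"
        else:
            print("Absent: the supplied URL is not a registered identifier scheme")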
  46. 46. Live demo of evaluation
  47. 47. Automated, Objective, Aspirational! It’s HARD to pass the automated tests! (much harder than you would think!) They are unforgiving, because they are automated. They force you to think about your interface from the perspective of a machine-visitor.
  48. 48. ~~Real evaluations - Human vs. Evaluator [table: each FAIR metric (Identifier, F1A, F2, F3, F4, A1.1, A1.2, A2, I1, I2, I3, R1.1, R1.2) and its questionnaire question(s), scored manually versus by the automated Evaluator for Dataverse, Dryad and Zenodo; all three resources are identified by DOIs]
  49. 49. Not genuine evaluation at this time… pure trust, until all necessary standards are in FAIRSharing [same comparison table as the previous slide]
  50. 50. Automated, Objective, Aspirational! If you ever start to score 100% on the Metrics… ...we will just create harder Metrics!! (a 100% score on the current Metrics does not mean that you are FAIR) FAIR is ASPIRATIONAL! You can always do better!
  51. 51. TODOs:
      1) All messages should be JSON-LD
      2) Connect more Metric tests to FAIRSharing as new standards are indexed
      3) Deploy LDP server for long-term storage of evaluation results (e.g. to monitor progress over time)
      4) Add exemplar data to each Metric Test’s smartAPI (for auto-testing and additional guidance to users)
      5) Suggestions…?
  52. 52. Contact: Mark Wilkinson mark.wilkinson@upm.es Ministerio de Economía y Competitividad grant number TIN2014-55993-RM
