The FAIR Guiding Principles facilitate the Findability, Accessibility, Interoperability, and Reusability of digital resources. The Library of Integrated Network-based Cellular Signatures (LINCS) Project has sought to implement the FAIR principles across its resources to optimize their usability. We surveyed the FAIR principles and are implementing specific facets within the LINCS resources. With reference to the literature and to other efforts to measure FAIRness, we are also developing quantitative metrics to assess the FAIRness of each dataset and resource, providing users with objective measures of the characteristics of the LINCS project. Assessing and improving the FAIRness of LINCS is an ongoing effort by our team that will benefit from community input to ensure that all LINCS users are optimally engaged with this resource.
FAIRness Assessment of the Library of Integrated Network-based Cellular Signatures (LINCS) Resources
1. Implementing the FAIR Principles
in the Library of Integrated Network-based
Cellular Signatures (LINCS) Resources
Kathleen Jagodnik, Ph.D.
Ma’ayan Laboratory
Department of Pharmacological Sciences
Icahn School of Medicine at Mount Sinai
New York, New York
BD2K FAIRness Metrics Working Group
June 13, 2017
8. FAIRness Principles
Wilkinson MD et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship.
Scientific Data, 3: 160018.
9. Starr, J., Castro, E., Crosas, M., Dumontier, M., Downs, R. R., Duerr, R., ... &
Clark, T. (2015). Achieving human and machine accessibility of cited data in
scholarly publications. PeerJ Computer Science, 1, e1.
14. FAIRness Guidelines
Wilkinson MD et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship.
Scientific Data, 3: 160018.
15. Assessing FAIRness of LINCS Resources
Wilkinson MD et al. (2016) The FAIR Guiding Principles for scientific data management and stewardship.
Scientific Data, 3: 160018.
LINCS Resources
16. Workflows for Assessing LINCS FAIRness
Reusability Metric R1: meta(data) are richly described with a plurality of accurate and relevant attributes
Sub-Metric R1.1: (meta)data are released with a clear and accessible data usage license
19. DCIC Data Science Symposium Hackathon
6 participants, 3 hours
Discussed a range of open questions related to LINCS FAIRness
Produced a Jupyter Notebook that reports statistics for domains associated with first-page Google search results for specified queries
20. Jupyter Notebook for Assessing Google Search Results
Query Phrases:
LDP
NPC methotrexate dataset
Valproic Acid dataset MCF7 cells
Cancer cell line chemical perturbation dataset
Imatinib perturbation dataset
Radicicol cell perturbation signature
NPC perturbation
Methotrexate genes MCF7
MCF7 RNAseq
MCF7 L1000
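The domain-statistics step of the hackathon notebook can be sketched in Python. This is a minimal illustration, not the actual notebook code: the result URLs below are invented, and real first-page results would come from a search API or a scraped results page.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical first-page result URLs for one query (illustrative only;
# the real notebook obtains these from Google search results)
result_urls = [
    "https://lincsproject.org/LINCS/data",
    "http://lincsportal.ccs.miami.edu/datasets/",
    "https://www.ncbi.nlm.nih.gov/geo/",
    "https://lincsproject.org/LINCS/tools",
]

def domain_counts(urls):
    """Tally how often each domain appears among the result URLs."""
    return Counter(urlparse(u).netloc for u in urls)

print(domain_counts(result_urls).most_common())
```

A high count for LINCS-affiliated domains on the first page suggests that the queried dataset is findable through a general-purpose search engine.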
30. Recommended Content for Landing Pages
Starr, J., Castro, E., Crosas, M., Dumontier, M., Downs, R. R., Duerr, R., ... &
Clark, T. (2015). Achieving human and machine accessibility of cited data in
scholarly publications. PeerJ Computer Science, 1, e1.
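One common way to satisfy the machine-accessibility recommendations of Starr et al. is to embed schema.org Dataset metadata in the landing page as JSON-LD. The sketch below uses placeholder values throughout; it is an assumption about how a landing page could expose such metadata, not a description of an existing LINCS page.

```python
import json

# Placeholder schema.org Dataset metadata for a landing page (all values
# are hypothetical; a real page would use its own identifiers and URLs)
landing_page_metadata = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "Example LINCS L1000 signature dataset",
    "identifier": "https://doi.org/10.xxxx/example",   # placeholder persistent identifier
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {"@type": "Organization", "name": "LINCS DCIC"},
    "distribution": {
        "@type": "DataDownload",
        "encodingFormat": "text/csv",
        "contentUrl": "https://example.org/data.csv",  # hypothetical download URL
    },
}

# Serialized form, ready to embed in a <script type="application/ld+json"> tag
print(json.dumps(landing_page_metadata, indent=2))
```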
41. BD2K API Interoperability Working Group
Co-chairs:
Chunlei Wu (cwu@scripps.edu)
Michel Dumontier (michel.dumontier@maastrichtuniversity.nl)
Administrators:
Sam Moore (samuel.moore@nih.gov)
Denise Luna (deniseluna@bd2kccc.org)
50. Licensing Survey
Assessed the licensing information of 10 biomedical projects:
Sites tend not to differentiate between data and software
Policies differ widely by resource
Some resources assert copyright and others do not; some, such as FlyBase, apply different copyrights to subsets of their resources
Some allow unrestricted use for non-commercial purposes but require a license for commercial use
"As-is" disclaimers appear on some sites
Privacy policies are sometimes available, with an option to opt out
Login is typically not required; cookies are used
51. 1. The repository has an explicit mission to provide access to and preserve data in its domain.
2. The repository maintains all applicable licenses covering data access and use and monitors compliance.
3. The repository has a continuity plan to ensure ongoing access to and preservation of its holdings.
4. The repository ensures, to the extent possible, that data are created, curated, accessed, and used in compliance with
disciplinary and ethical norms.
5. The repository has adequate funding and sufficient numbers of qualified staff managed through a clear system of
governance to effectively carry out the mission.
6. The repository adopts mechanism(s) to secure ongoing expert guidance and feedback (either in-house, or external,
including scientific guidance, if relevant).
7. The repository guarantees the integrity and authenticity of the data.
8. The repository accepts data and metadata based on defined criteria to ensure relevance and understandability for
data users.
The Core Trustworthy Data Repository Requirements
https://www.datasealofapproval.org/en/information/requirements/
52. 9. The repository applies documented processes and procedures in managing archival storage of the data.
10. The repository assumes responsibility for long-term preservation and manages this function in a planned and
documented way.
11. The repository has appropriate expertise to address technical data and metadata quality and ensures that sufficient
information is available for end users to make quality-related evaluations.
12. Archiving takes place according to defined workflows from ingest to dissemination.
13. The repository enables users to discover the data and refer to them in a persistent way through proper citation.
14. The repository enables reuse of the data over time, ensuring that appropriate metadata are available to support the
understanding and use of the data.
15. The repository functions on well-supported operating systems and other core infrastructural software and is using
hardware and software technologies appropriate to the services it provides to its Designated Community.
16. The technical infrastructure of the repository provides for protection of the facility and its data, products, services,
and users.
53. Developing New Standards for Datasets & Tools
Beyond addressing repositories, develop standards for datasets & tools
Lists of 25 binary criteria for separately evaluating LINCS datasets & tools
Criteria will be updated annually
Open-source Web-based system
Self-evaluation or independent third-party assessment are both possibilities
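A binary-criteria checklist like the one described above lends itself to a simple score. The sketch below computes the fraction of satisfied criteria; the criterion names are invented for illustration, and the real LINCS lists contain 25 entries each.

```python
# Hypothetical binary criteria for one resource (names are illustrative;
# the actual lists contain 25 criteria each for datasets and for tools)
criteria_results = {
    "persistent_identifier_assigned": True,
    "metadata_machine_readable": True,
    "usage_license_stated": False,
    "standard_vocabulary_used": True,
}

def fairness_score(results):
    """Fraction of binary criteria that the resource satisfies."""
    return sum(results.values()) / len(results)

print(f"FAIRness score: {fairness_score(criteria_results):.2f}")  # 3 of 4 -> 0.75
```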
57. FAIRness Assessment for Example LINCS Dataset
Key: Blue: criterion is satisfied; Red: criterion is not satisfied; Black: more information is required to reach a conclusion
58. FAIRness Assessment for Example LINCS Dataset
The dataset is available in a human-readable format
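The three-way color key used in these assessments (satisfied, not satisfied, more information required) maps naturally onto True/False/None. The sketch below tallies one assessment under that assumption; the criterion names are invented for illustration.

```python
# Hypothetical three-state assessment, mirroring the blue/red/black key:
# True = satisfied, False = not satisfied, None = more info required
assessment = {
    "available_in_human_readable_format": True,
    "persistent_identifier_assigned": True,
    "usage_license_specified": False,
    "provenance_documented": None,  # inconclusive pending more information
}

def summarize(assessment):
    """Count satisfied, unsatisfied, and undetermined criteria."""
    decided = {k: v for k, v in assessment.items() if v is not None}
    satisfied = sum(decided.values())
    return {
        "satisfied": satisfied,
        "unsatisfied": len(decided) - satisfied,
        "undetermined": len(assessment) - len(decided),
    }

print(summarize(assessment))  # {'satisfied': 2, 'unsatisfied': 1, 'undetermined': 1}
```

Keeping the undetermined criteria separate, rather than counting them as failures, preserves the distinction the color key draws.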
60. FAIRness Assessment for Example LINCS Tool
Key: Blue: criterion is satisfied; Red: criterion is not satisfied; Black: more information is required to reach a conclusion
61. FAIRness Assessment for Example LINCS Tool
All previous versions of the tool are made available
66. Open Questions
How can the quality of information be assessed in an automated manner?
How can we clearly differentiate among FAIRness criteria and sub-criteria? Will this differ by project?
Do certain criteria precede others?
69. Acknowledgments
Avi Ma’ayan, Ph.D.
Stephan Schürer, Ph.D.
Dusica Vidovic, Ph.D.
Daniel Cooper, Ph.D.
Raymond Terryn, Ph.D.
Caty Chung, M.S.
Vasileios Stathias, B.S.
Ajay Pillai, Ph.D.
Denis Torre, B.S.
Alexandra Keenan, M.S.
Wen Niu, M.S.
NIH T32 Training Grant #4T32HL007824-19
70. References
Dunning, A., de Smaele, M., Böhmer, J. (2017) Are the FAIR Data Principles fair? IDCC17 Practice
Paper. The 12th International Digital Curation Conference, February 20-23, 2017, Edinburgh,
Scotland.
FORCE 11 (2014a). The FAIR Data Principles. FORCE11. Retrieved 18 January 2017, from
https://www.force11.org/group/fairgroup/fairprinciples
FORCE 11 (2014b). Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data
Publishing version b1.0. FORCE11. Retrieved 18 January 2017, from
https://www.force11.org/fairprinciples
H2020 Guidelines on FAIR Data Management:
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
Sansone, S.-A. et al. (2017) DATS: the data tag suite to enable discoverability of datasets. bioRxiv,
103143.
Starr, J. et al. (2015) Achieving human and machine accessibility of cited data in scholarly
publications. PeerJ Computer Science, 1, e1.
Wilkinson, M. D. et al. (2016) The FAIR Guiding Principles for scientific data management and
stewardship. Scientific Data, 3: 160018.
Wilkinson, M. D. et al. (2017) Interoperability and FAIRness through a novel combination of Web
technologies (No. e2522v2). PeerJ Preprints.
71. Images Used
Magnifying glass icon: http://images.clipartpanda.com/magnifying-glass-clipart-biy5E46iL.png
Key icon: https://img.clipartfest.com/7b75f290f4781b7331b6bb477ab7ea69_black-olde-key-clip-art-key-black-and-white-clipart_640-480.svg
Gears icon: https://cdn.shutterstock.com/shutterstock/videos/10879745/thumb/1.jpg
Recycling icon: http://www.recycling.com/wp-content/uploads/recycling%20symbols/black/Black%20Recycling%20Symbol%20(U+267B).gif
Green pie chart: http://www.psycinsight.co.nz/wp-content/uploads/2015/03/pie-chart-green.jpg
Documentation icon: https://d30y9cdsu7xlg0.cloudfront.net/png/192334-200.png
Missing puzzle pieces: http://www.lshtm.ac.uk/php/departmentofhealthservicesresearchandpolicy/researchareas/economic/addressingmissingdataincea/puzzle_300.jpg
Vocabulary: http://dev3.ccs.miami.edu:8080/apis/#/datasets
Ontology: https://1.bp.blogspot.com/-gNWHuPpPhDA/T060tS4uk_I/AAAAAAAAqK0/G2Mb69rwMZg/s1600/GO.png
Metadata sphere: https://silwoodtechnology.files.wordpress.com/2013/07/metadata_ball.jpg
Diminishing returns: https://personalexcellence.co/files/graph-diminishing-returns.gif
Documentation: Open book: https://mountainss.files.wordpress.com/2012/09/sysctr-documentation-icon.jpg?w=611
Ontology on blackboard: http://www.emiliosanfilippo.it/wp-content/uploads/2011/11/Ontology.jpg
Data Seal of Approval: http://datasupport.researchdata.nl/uploads/pics/logo_DSA_regulier_120x120_01.jpeg
Diverse users: http://ymedialabs.com/wp-content/uploads/2016/02/target.jpg
Pins in map: https://www.gladd.co.uk/images/images/mystery_location.jpg
Checklist: http://vinciworks.com/blog/wp-content/uploads/2017/03/Data-protection-checklist.png
Bull’s eye of relevance: http://jisushopping.net/B2Bblog/wp-content/uploads/2013/11/JiSu-B2B-Blog-Marketing-Offer-Relevance.jpg
Questions: http://download.4-designer.com/files/20121225/3D-villain-with-a-question-mark-high-quality-pictures-2-34592-thumb.jpg