Are we FAIR yet? And will it be worth it?
The FAIR Principles propose essential characteristics that all digital resources (e.g. datasets, repositories, web services) should possess to be Findable, Accessible, Interoperable, and Reusable by both humans and machines. The Principles act as a guide to what researchers and data stewards should expect from contemporary digital resources and, in turn, to the requirements placed on them when publishing their own scholarly products. As interest in, and support for, the Principles has spread, the diversity of interpretations has also broadened, with some resources claiming to already “be FAIR”.
This talk will elaborate on what FAIR is, why we need it, what it entails, and how we should evaluate FAIRness. I will describe new social and technological infrastructure to support the creation and evaluation of FAIR resources, and how FAIR fits into institutional, national, and international efforts. Finally, I will discuss the merits of the FAIR principles (and what we ask of people) in the context of strengthening data-driven scientific inquiry.
Keynote given at NETTAB2018 - http://www.igst.it/nettab/2018/
The Role of the FAIR Guiding Principles for an effective Learning Health System - Michel Dumontier
The learning health system (LHS) is an integrated social and technological system that embeds continuous improvement and innovation for the effective delivery of healthcare. A crucial part of the LHS lies in how the underlying information system will secure and take advantage of relevant knowledge assets towards supporting complex and unusual clinical decision making, facilitating public health surveillance, and aiding comparative effectiveness research. However, key knowledge assets remain difficult to obtain and reuse, particularly in a decentralized context. In this talk, I will discuss the role of the Findable, Accessible, Interoperable, and Reusable (FAIR) Guiding Principles towards the realization of the LHS, along with emerging technologies to publish and refine clinical research and knowledge derived therein.
Keynote given for 2021 Knowledge Representation for Health Care http://banzai-deim.urv.net/events/KR4HC-2021/
The role of the FAIR Guiding Principles in a Learning Health System - Michel Dumontier
The learning health system (LHS) is a concept for a socio-technological system that continuously improves the delivery of health care by coupling biomedical research with practice- and evidence-based medicine. Key aspects of the LHS are collecting, integrating, and analyzing data from different sources. While the increased digitalisation of healthcare is creating new data sources, these remain hard to find and use, let alone exploit as part of intelligent systems for the benefit of patients, healthcare providers, and researchers. This talk will examine recent developments towards making key parts of the LHS, such as clinical practice guidelines, Findable, Accessible, Interoperable, and Reusable (FAIR).
The future of science and business - a UM Star Lecture - Michel Dumontier
I discuss how data science is affecting our way of life and how we at Maastricht University are preparing the next generation of leaders to address opportunities and challenges in a responsible manner.
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ... - Michel Dumontier
With its focus on improving the health and well-being of people, biomedicine has always been a fertile, if challenging, domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies offers exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reusable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services enable automated scientific discovery on a global scale.
Identifying Drug Interaction Candidates in Real-World Data - Neo4j
Speakers: Kathleen Mandziuk, Vice President, Patient Strategy and Digital Health, PRA HealthSciences
Nathan Smith, Senior Principal Data Scientist, PRA HealthSciences
Kerry Deem, Associate Director, Programming, PRA HealthSciences
apidays LIVE Australia 2021 - APIs enable global collaborations and accelerat... - apidays
apidays LIVE Australia 2021 - Accelerating Digital
September 15 & 16, 2021
Locknote: APIs enable global collaborations and accelerate health and medical research
Dr. Denis Bauer, Head Cloud Computing Bioinformatics at CSIRO
Blockchain and Patient-Centered Outcomes Measures - Goldwater - Sean Manion PhD
Blockchain in Health Research 2019 was the 2nd annual summit hosted at Georgetown University on 27 Apr 2019 by Sean Manion, Science Distributed and Gilles Hilary, Georgetown University.
Accelerating biomedical discovery with an internet of FAIR data and services... - Michel Dumontier
With its focus on improving the health and well-being of people, biomedicine has always been a fertile, if challenging, domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies offers exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reusable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services, which is built on Semantic Web technologies, be well positioned to support automated scientific discovery on a global scale.
This presentation was provided by Keri Mattaliano and Ray Gilmartin of Copyright Clearance Center, during the NISO event "Transforming Search: What the Information Community Can and Should Build." The virtual conference was held on August 26, 2020.
This presentation was provided by Markus Kaindl of Springer Nature, during the NISO event "Transforming Search: What the Information Community Can and Should Build." The virtual conference was held on August 26, 2020.
Blockchain Healthcare Situation Report (BC/HC SITREP) Volume 2 Issue 19, 07 - 13 May 2018. A weekly newsletter curating news and events relating to blockchain and healthcare by Sean Manion, CEO of Science Distributed.
Prototype SDX Bioinformatics Exchange: Demonstrating an Essential Use-Case fo... - US-Ignite
Robert Grossman, University of Chicago
Joe Mambretti, Northwestern University
Piers Nash, University of Chicago
Jim Chen, Northwestern University
Allison Heath, University of Chicago
Big data for healthcare analytics final - v0.3 miz - Yusuf Brima
Sources of Big Data in Health (a comparative description of national and international data sources and identification of new/emerging sources of data)
How much is that data in the window: Healthcare data valuation - Sean Manion PhD
Presentation on healthcare data valuation, data confidence fabrics, layers of trust in healthcare, and health data marketplaces as part of the Health Data Valuation event, Session 10 of the IEEE Healthcare: Blockchain & AI Virtual Series on 25 August 2021
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat... - Michel Dumontier
Biomedicine has always been a fertile and challenging domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies offers exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reusable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services enable automated scientific discovery on a global scale.
Bio:
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research focuses on the development of computational methods for scalable and responsible discovery science. Dr. Dumontier obtained his BSc (Biochemistry) in 1998 from the University of Manitoba, and his PhD (Bioinformatics) in 2005 from the University of Toronto. Previously a faculty member at Carleton University in Ottawa and Stanford University in Palo Alto, Dr. Dumontier founded and directs the interfaculty Institute of Data Science at Maastricht University to develop sociotechnological systems for responsible data science by design. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon 2020, the European Open Science Cloud, the US National Institutes of Health and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
This presentation was given on October 21, 2020 at CIKM2020.
ISC2 Privacy-Preserving Analytics and Secure Multiparty Computation - UlfMattsson7
Use Cases in Machine learning (ML)
Secure Multi-Party Computation (SMPC)
Homomorphic encryption (HE)
Differential Privacy (DP) and K-Anonymity
Pseudonymization and Anonymization
Synthetic Data
Zero trust architecture (ZTA)
Zero-knowledge proofs (ZKP)
Private Set Intersection (PSI)
Trusted execution environments (TEE)
Post-Quantum Cryptography
Regulations and Standards in Data Privacy
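One of the techniques listed above can be made concrete with a small example: the following is a minimal, illustrative sketch of differential privacy via the Laplace mechanism. The function, data, and parameter choices are hypothetical, and this is not production-grade privacy code.

```python
import math
import random

def dp_count(values, predicate, epsilon: float) -> float:
    """Differentially private count: the true count plus Laplace(1/epsilon) noise.

    A counting query has sensitivity 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    # Sample Laplace noise via the inverse-CDF transform of a uniform variate.
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Hypothetical example: a private count of patients aged 40 or over.
ages = [23, 35, 41, 29, 62, 58, 47]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=0.5)
```

A smaller epsilon gives stronger privacy but noisier answers, and repeated queries consume privacy budget, which any real deployment must track.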
E-marketing is a process of planning and executing the conception, distribution, promotion, and pricing of products and services in a computerized, networked environment, such as the Internet and the World Wide Web, to facilitate exchanges and satisfy customer demands.
Can blockchain technology be the answer to IoT and AI security for Industry 4.0? Industrial Security Forum - The Secure Path of the Digital Future - Presentation at the Hannover Messe Industrie (HMI), Germany in April 2018
I presented this at the launch event for the DRIVA project at the University of Brighton on 18 March 2019. Link: https://www.brighton.ac.uk/about-us/news-and-events/news/2019/03-18-creative-big-data-project-launched.aspx
This presentation explores how the blockchain ecosystem is developing to support a vibrant data economy. We look at why data quality matters, how AI needs trusted data, and how massive investment is coming into the blockchain-powered data economy. We also look at key ways blockchain is enabling innovation in the consumer data economy. We examine how two major tech companies are taking action in blockchain, and suggest things that any company can do now.
Community of practice on socio-economic data - IFPRI-PIM
This presentation was given virtually by Gideon Kruseman (CIMMYT), as part of the Capacity Development Workshop hosted by the CGIAR Collaborative Platform for Gender Research. The event took place on 7-8 December 2017 in Amsterdam, the Netherlands, where the Platform is hosted (by KIT Royal Tropical Institute).
Read more: http://gender.cgiar.org/gender_events/annual-scientific-conference-capacity-development-workshop-cgiar-collaborative-platform-gender-research/
Protecting data privacy in analytics and machine learning - ISACA London UK - Ulf Mattsson
ISACA London Chapter webinar, Feb 16th 2021
Topic: “Protecting Data Privacy in Analytics and Machine Learning”
Abstract:
In this session, we will discuss a range of new emerging technologies for privacy and confidentiality in machine learning and data analytics. We will discuss how to put these technologies to work for databases and other data sources.
When we think about developing AI responsibly, there are many different activities that we need to consider.
This session also discusses international standards and emerging privacy-enhancing computation techniques, including secure multiparty computation, zero trust, cloud, and trusted execution environments. We will discuss the “why, what, and how” of techniques for privacy-preserving computing.
We will review how different industries are taking advantage of these privacy-preserving techniques. A retail company used secure multi-party computation to respect user privacy and specific regulations while still gaining insights and protecting the organization’s IP. A healthcare organization uses secure data-sharing to protect the privacy of individuals, and also stores and searches encrypted medical data in the cloud.
We will also review the benefits of secure data-sharing for financial institutions, including a large bank that wanted to broaden access to its data lake without compromising data privacy, while preserving the data’s analytical quality for machine learning purposes.
Enrichment - Unlocking the value of data for digital transformation - Big Da... - webwinkelvakdag
As pressure for digital transformation increases, companies must harness big data more effectively. But the well-known V’s of data—volume, variety, velocity—represent both opportunities and challenges. Data enrichment enables organizations to take full advantage of the benefits while addressing these typical problems. In this session, we look at what an enrichment workflow might look like and how it enhances data’s value across different use cases.
D2D - Turning information into a competitive asset - 23 Jan 2014 - Henk van Roekel
Understanding the evolution of Business Intelligence and Analytics and the challenges and opportunities that come with it. Exploring CGI's Data2Diamonds™ approach, ensuring financially sound, technically viable, and socially desirable Big Data initiatives.
Connected barrels: IoT in Oil and Gas - Deloitte - Anshu Mittal
In the oil and gas industry, the promise of IoT applications lies not with managing existing assets, supply chains, or customer relationships but, rather, in creating new value in information about these. An integrated deployment strategy is key for O&G companies looking to find value in IoT technology.
Similar to Are we FAIR yet? And will it be worth it? (20)
The increased availability of biomedical data, particularly in the public domain, offers the opportunity to better understand human health and to develop effective therapeutics for a wide range of unmet medical needs. However, data scientists remain stymied by the fact that data remain hard to find and to productively reuse because data and their metadata i) are wholly inaccessible, ii) are in non-standard or incompatible representations, iii) do not conform to community standards, and iv) have unclear or highly restricted terms and conditions that preclude legitimate reuse. These limitations require a rethink of how data can be made machine- and AI-ready - the key motivation behind the FAIR Guiding Principles. Concurrently, while recent efforts have explored the use of deep learning to fuse disparate data into predictive models for a wide range of biomedical applications, these models often fail even when the correct answer is already known, and fail to explain individual predictions in terms that data scientists can appreciate. These limitations suggest that new methods to produce practical artificial intelligence are still needed.
In this talk, I will discuss our work in (1) building an integrative knowledge infrastructure to prepare FAIR and "AI-ready" data and services along with (2) neurosymbolic AI methods to improve the quality of predictions and to generate plausible explanations. Attention is given to standards, platforms, and methods to wrangle knowledge into simple, but effective semantic and latent representations, and to make these available into standards-compliant and discoverable interfaces that can be used in model building, validation, and explanation. Our work, and those of others in the field, creates a baseline for building trustworthy and easy to deploy AI models in biomedicine.
Bio
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research explores socio-technological approaches for responsible discovery science, which includes collaborative multi-modal knowledge graphs, privacy-preserving distributed data mining, and AI methods for drug discovery and personalized medicine. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon Europe, the European Open Science Cloud, the US National Institutes of Health, and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
Knowledge graphs are an emerging paradigm for representing information, yet their discovery and reuse are hampered by insufficient or inadequate metadata. The COST Action Distributed Knowledge Graphs held a first workshop to develop a KG metadata schema. In this presentation, the progress and plans are discussed with the W3C Community Group on Knowledge Graph Construction.
Data-Driven Discovery Science with FAIR Knowledge Graphs - Michel Dumontier
Despite the existence of vast amounts of biomedical data, these remain difficult to find and to productively reuse in machine learning and other artificial intelligence technologies. In this talk, I will discuss the role of the FAIR Guiding Principles in making biomedical data AI-ready, and how their representation as knowledge graphs not only enables powerful ontology-backed semantic queries, but can also be used to predict missing information and to check the quality of the collected knowledge.
The main idea of the talk is to introduce the FAIR principles (what they are and what they are not), and to show how their application with semantic web technologies (ontologies/linked data) creates improved possibilities for large-scale data integration, answering sophisticated questions using automated reasoners, and predicting new relations and validating data using graph embeddings. The audience will gain insight into the state of the art in a carefully presented manner that introduces principles, approaches, and outcomes relevant to Health AI.
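The kind of integration the abstract describes can be sketched with a toy triple store in plain Python. Real systems would use RDF, shared ontologies, and SPARQL; all IRIs and vocabulary terms below are hypothetical.

```python
# A toy in-memory triple store illustrating linked-data-style integration.
triples = {
    # Dataset A, described with a shared drug vocabulary.
    ("ex:aspirin", "rdf:type", "ex:Drug"),
    ("ex:aspirin", "ex:indication", "ex:headache"),
    # Dataset B, mapped onto the same vocabulary.
    ("ex:ibuprofen", "rdf:type", "ex:Drug"),
    ("ex:ibuprofen", "ex:indication", "ex:inflammation"),
}

def match(pattern):
    """Return all triples matching an (s, p, o) pattern; None is a wildcard."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

# Once both datasets use one vocabulary, a single query spans them.
drugs = {s for s, _, _ in match((None, "rdf:type", "ex:Drug"))}
indications = {s: o for s, _, o in match((None, "ex:indication", None))}
```

The payoff of shared vocabularies is visible in the last two lines: one pattern query reaches records that originated in different datasets.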
The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles light a path towards improving the discovery and reuse of digital objects (data, documents, software, web services, etc.) by machines. Machine reusability is a crucial strategic component in building robust digital infrastructure that strengthens scholarship and opens new pathways for innovation on a truly global scale. However, as the FAIR principles do not specify any particular implementation, it falls to communities to devise, standardize, and implement technical specifications that improve the 'FAIRness' of digital assets. In this seminar, I will focus on the history and state of the art of FAIRness assessment, including manual, semi-automated, and fully automated approaches, and how these can be used by developers and consumers alike. The seminar will serve as a springboard for community discussion and for the adoption of these services to incrementally and realistically improve the FAIRness of resources.
A talk prepared for the workshop "Working on data stewardship? Meet your peers!"
Date: 3 October 2017
https://www.surf.nl/agenda/2017/10/workshop-working-on-data-stewardship-meet-your-peers/index.html
Towards metrics to assess and encourage FAIRness (Michel Dumontier)
With an increased interest in the FAIR metrics, there is a need to develop tools and approaches that can assess the FAIRness of a digital resource. This talk begins to explore some ideas in this space, and invites people to participate in a working group focused on the development, application, and evaluation of FAIR metrics.
A presentation to the New Year's Event for Maastricht University's Knowledge Engineering @ Work Program. https://www.maastrichtuniversity.nl/news/kework-first-10-students-academic-workstudy-track-graduate
Bio2RDF is an open-source project that offers a large and connected knowledge graph of Life Science Linked Data. Each dataset is expressed using its own vocabulary, thereby hindering the ability to integrate, search, query, and browse across similar or identical types of data. With growth and content changes in source data, a manual approach to maintaining mappings has proven untenable. The aim of this work is to develop a (semi-)automated procedure to generate high-quality mappings between Bio2RDF and SIO using BioPortal ontologies. Our preliminary results demonstrate that our approach is promising in that it can find new mappings using a transitive closure over ontology mappings. Further development of the methodology, coupled with improvements in the ontology, will offer a better-integrated view of the Life Science Linked Data.
Ontology has its roots as a field of philosophical study that is focused on the nature of existence. However, today's ontology (aka knowledge graph) can incorporate computable descriptions that can bring insight in a wide set of compelling applications including more precise knowledge capture, semantic data integration, sophisticated query answering, and powerful association mining - thereby delivering key value for health care and the life sciences. In this webinar, I will introduce the idea of computable ontologies and describe how they can be used with automated reasoners to perform classification, to reveal inconsistencies, and to precisely answer questions. Participants will learn about the tools of the trade to design, find, and reuse ontologies. Finally, I will discuss applications of ontologies in the fields of diagnosis and drug discovery.
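The classification task that automated reasoners perform can be illustrated with a minimal sketch in plain Python (this is not a production OWL reasoner, and the class hierarchy is invented): asserted is-a axioms are closed under transitivity so that every superclass of a class becomes explicit.

```python
# Sketch: ontology classification via transitive closure of is-a links.
# Not a real OWL reasoner; class names are illustrative only.

def classify(is_a):
    """Infer all superclasses of each class from asserted is-a links."""
    inferred = {c: set(parents) for c, parents in is_a.items()}
    changed = True
    while changed:
        changed = False
        for c in inferred:
            for p in list(inferred[c]):
                for grandparent in inferred.get(p, ()):
                    if grandparent not in inferred[c]:
                        inferred[c].add(grandparent)
                        changed = True
    return inferred

axioms = {
    "aspirin": {"nsaid"},
    "nsaid": {"anti-inflammatory drug"},
    "anti-inflammatory drug": {"drug"},
}
# aspirin is classified under nsaid, anti-inflammatory drug, and drug
print(sorted(classify(axioms)["aspirin"]))
```

Real reasoners additionally handle logical definitions (necessary and sufficient conditions) and detect inconsistencies, but the core payoff is the same: queries against the inferred hierarchy rather than only the asserted one.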
Bio:
Dr. Michel Dumontier is an Associate Professor of Medicine (Biomedical Informatics) at Stanford University. His research focuses on the development of methods to integrate, mine, and make sense of large, complex, and heterogeneous biological and biomedical data. His current research interests include (1) using genetic, proteomic, and phenotypic data to find new uses for existing drugs, (2) elucidating the mechanism of single and multi-drug side effects, and (3) finding and optimizing combination drug therapies. Dr. Dumontier is the Stanford University Advisory Committee Representative for the World Wide Web Consortium, the co-Chair for the W3C Semantic Web for Health Care and the Life Sciences Interest Group, a scientific advisor for the EBI-EMBL Chemistry Services Division, and the Scientific Director for Bio2RDF, an open source project to create Linked Data for the Life Sciences. He is also the founder and Editor-in-Chief of Data Science, a new IOS Press journal featuring open access, open review, and semantic publishing.
Building a Network of Interoperable and Independently Produced Linked and Ope... (Michel Dumontier)
Over 15 years ago, Sir Tim Berners-Lee proclaimed the founding of an exciting new future involving intelligent agents operating over smarter data in order to perform complex tasks at the behest of their human controllers. At the heart of this vision lies an uneasy alliance between tedious formal knowledge representations and powerful analytics over big, but often messy, data. Bio2RDF, our decade-old open source project to create Linked Data for the life sciences, has woven emergent Semantic Web technologies such as ontologies and Linked Data to generate FAIR (Findable, Accessible, Interoperable, and Reusable) data in the form of billions of machine-accessible statements for use in downstream biomedical discovery.
This revolution in data publication has been strengthened by action from global bioinformatics institutions such as the NCBI, NCBO, EBI, and DBCLS. Notably, NCBI's PubChem has successfully coupled large scale data integration with community-based standards to offer a remarkable biochemical knowledge resource amenable to data-hungry discovery tools. Yet, in the face of increasing pressure from researchers, funders, and publishers, will these approaches be sufficient for growing and maintaining a comprehensive knowledge graph that is inclusive of all biomedical research?
Model organisms such as budding yeast provide a common platform to interrogate and understand cellular and physiological processes. Knowledge about model organisms, whether generated during the course of scientific investigation or extracted from published articles, is made available by model organism databases (MODs) such as the Saccharomyces Genome Database (SGD) for powerful, data-driven bioinformatic analyses. Integrative platforms such as InterMine offer a standard platform for MOD data exploration and data mining. Yet, today's bioinformatic analyses also require access to a significantly broader set of structured biomedical data, such as what can be found in the emerging network of Linked Open Data (LOD). If MOD data could be provisioned as FAIR (Findable, Accessible, Interoperable, and Reusable), then scientists could leverage a greater amount of interoperable data in knowledge discovery.
The goal of this proposal is to increase the utility of MOD data by implementing standards-compliant data access interfaces that interoperate with Linked Data. We will focus our efforts on developing interfaces for data access, data retrieval, and query answering for SGD. Our software will publish InterMine data as LOD that are semantically annotated with ontologies and be retrieved using standardized formats (e.g. JSON-LD, Turtle). We will facilitate the exploration of MOD data for hypothesis testing, by implementing efficient query answering using Linked Data Fragments, and by developing a set of graphical user interfaces to search for data of interest, explore connections, and answer questions that leverage the wider LOD network. Finally, we will develop a locally and cloud-deployable image to enable the rapid deployment of the proposed infrastructure. Our efforts to increase interoperability and ease of deployment for biomedical data repositories will increase research productivity and reduce costs associated with data integration and warehouse maintenance.
Making it Easier, Possibly Even Pleasant, to Author Rich Experimental Metadata (Michel Dumontier)
Biomedical researchers will remain stymied in their ability to take full advantage of the Big Data revolution if they can never find the datasets that they need to analyze, if there is lack of clarity about what particular datasets contain, and if data are insufficiently described.
CEDAR, an NIH BD2K Center of Excellence, aims to develop methods and tools to vastly ease the burden of authoring good experimental metadata, and to maximally use this information to zero in on datasets of interest.
Semantic web technologies offer a potential mechanism for the representation and integration of thousands of biomedical databases. Many of these databases offer cross-references to other data sources, but these are generally incomplete and prone to error. In this paper, we conduct an empirical analysis of the link structure of life science Linked Data, obtained from the Bio2RDF project. Three different link graphs for datasets, entities and terms are characterized by degree, connectivity, and clustering metrics, and their correlation is measured as well. Furthermore, we utilize the symmetry and transitivity of entity links to build a benchmark and evaluate several popular entity matching approaches. Our findings indicate that the life science data network can help find hidden links, can be used to validate links, and may offer a mechanism to integrate a wider set of resources to support biomedical knowledge discovery.
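The degree and clustering metrics used in the link-graph analysis can be computed with a short sketch in pure Python. The edges below are invented for illustration and are not actual Bio2RDF dataset links:

```python
# Sketch: degree and local clustering coefficient on an undirected link graph.
# Dataset names and edges are illustrative, not actual Bio2RDF links.
from itertools import combinations

edges = {("drugbank", "kegg"), ("drugbank", "pubchem"),
         ("kegg", "pubchem"), ("kegg", "go")}

# Build an adjacency map from the edge list.
neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)

def degree(n):
    """Number of datasets directly linked to n."""
    return len(neighbors[n])

def clustering(n):
    """Fraction of n's neighbor pairs that are themselves linked."""
    nbrs = neighbors[n]
    if len(nbrs) < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in neighbors[u])
    return links / (len(nbrs) * (len(nbrs) - 1) / 2)

print(degree("kegg"), clustering("drugbank"))
```

The same two functions, run over the three Bio2RDF link graphs (datasets, entities, terms), would yield the distributions the paper characterizes.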
Making the most of phenotypes in ontology-based biomedical knowledge discovery (Michel Dumontier)
A phenotype is an observable characteristic of an individual and typically pertains to its morphology, function, and behavior. Phenotypes, whether observed at the bench or the bedside, are increasingly being used to gain insight into the diagnosis, mechanism, and treatment of disease. A key aspect of these approaches involves comparing phenotypes that are defined in multiple terminologies that often cater to altogether different organisms, such as mice and humans. In this seminar, I will discuss computational approaches for harmonizing and utilizing phenotypes for translational research. We will examine case studies involving the computation of semantic similarity, including the use of phenotypes to inform clinical diagnosis of rare diseases, to identify human drug targets using mouse knock-out models, and to explore phenotype-based approaches for drug repositioning.
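A common building block for the phenotype comparisons described above is a set-based similarity between the annotations of two entities. A minimal Jaccard sketch follows; the phenotype terms are invented placeholders, and production pipelines typically use ontology-aware semantic similarity measures (e.g. information-content based) rather than plain Jaccard:

```python
# Sketch: Jaccard similarity between two sets of phenotype annotations.
# Term labels are illustrative placeholders, not real HPO/MP codes.

def jaccard(a, b):
    """Intersection over union of two annotation sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

patient = {"abnormal gait", "seizure", "muscle weakness"}
mouse_model = {"abnormal gait", "muscle weakness", "tremor"}
print(round(jaccard(patient, mouse_model), 2))  # 0.5
```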
Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. This document describes a consensus among participating stakeholders in the Health Care and the Life Sciences domain on the description of datasets using the Resource Description Framework (RDF). This specification meets key functional requirements, reuses existing vocabularies to the extent that it is possible, and addresses elements of data description, versioning, provenance, discovery, exchange, query, and retrieval.
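A dataset description of the kind this specification standardizes can be sketched as a compact JSON-LD document. This is an illustrative fragment, not the normative HCLS profile; the DCAT and Dublin Core terms are real vocabularies, but the identifier and values are hypothetical:

```python
# Sketch: a minimal machine-readable dataset description in JSON-LD,
# reusing DCAT and Dublin Core terms. Identifier and values are invented.
import json

description = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "http://example.org/dataset/demo",  # hypothetical identifier
    "@type": "dcat:Dataset",
    "dct:title": "Demo dataset",
    "dct:license": "http://creativecommons.org/licenses/by/4.0/",
    "dct:publisher": "http://example.org/org/demo-lab",
    "dct:issued": "2018-10-22",
}
print(json.dumps(description, indent=2))
```

Because the record is plain JSON-LD, it can be harvested and queried by machines without any bespoke parsing, which is the point of the consensus specification.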
With its focus on investigating the basis for the sustained existence of living systems, modern biology has always been a fertile, if not challenging, domain for formal knowledge representation and automated reasoning. With thousands of databases and hundreds of ontologies now available, there is a salient opportunity to integrate these for discovery. In this talk, I will discuss our efforts to build a rich foundational network of ontology-annotated linked data, develop methods to intelligently retrieve content of interest, uncover significant biological associations, and pursue new avenues for drug discovery. As the portfolio of Semantic Web technologies continues to mature in terms of functionality, scalability, and an understanding of how to maximize their value, researchers will be strategically poised to pursue increasingly sophisticated KR projects aimed at improving our overall understanding of human health and disease.
Bio: Dr. Michel Dumontier is an Associate Professor of Medicine (Biomedical Informatics) at Stanford University. His research aims to find new treatments for rare and complex diseases. His research interests lie in the publication, integration, and discovery of scientific knowledge. Dr. Dumontier serves as a co-chair for the World Wide Web Consortium Semantic Web in Health Care and Life Sciences Interest Group (W3C HCLSIG) and is the Scientific Director for Bio2RDF, a widely used open-source project to create and provide linked data for life sciences.
1. Are we FAIR yet? And will it be worth it?
@micheldumontier::NETTAB:2018-10-22
Michel Dumontier, Ph.D.
Distinguished Professor of Data Science
Director, Institute of Data Science
2. An increasing number of discoveries are made using other people's data
3. A common rejection module (CRM) for acute rejection across multiple organs identifies novel therapeutics for organ transplantation. Khatri et al. JEM 210(11):2205. DOI: 10.1084/jem.20122709
Main Findings:
1. CRM genes correlated with the extent of graft injury and predicted future injury to a graft
2. Mice treated with drugs against the CRM genes extended graft survival
4. However, significant effort was needed to find the right datasets, make sense of them, and ultimately use them for a new purpose
6. If we are ever to realize the full potential of the content we create, then we must find ways to reduce the barrier to publishing digital content in a way that makes it vastly easier to find, assess and reuse
8. Why does this matter?
9. Most published research findings are false. - John Ioannidis, Stanford University (PLoS Med 2005;2(8):e124)
Reproducibility of landmark studies is shockingly low:
39% (39/100) in psychology (doi:10.1038/nature.2015.17433)
21% (14/67) in pharmacology (doi:10.1038/nrd3439-c1)
11% (6/53) in cancer (doi:10.1038/483531a)
11. We need new ways to think about discovery science. We need to improve our confidence in any result by using more data and with support from multiple lines of evidence.
13. We must build a social, ethical and technological infrastructure that facilitates the discovery and reuse of digital resources for people and machines
14. Why machines?
• Can gather and make sense of vast amounts of information to better understand the world and make more effective decisions
15. Big Data for Medicine
Multiple sources of heterogeneous data, including experimental evidence, bioinformatics databases, lifestyle measurements, electronic health records, environmental influences, and biobank findings, can be combined using machine learning algorithms to identify causal disease networks, stratify patients, and predict more efficacious therapies.
16. Why machines?
• Can make sense of vast amounts of information to make personalized, evidence-based decisions to maximize desired outcomes
• Can create detailed workflows to enable transparency and reproducibility
• Will be able to identify and minimize bias in research and in real-world applications in a robust and systematic manner
18. An international, bottom-up paradigm for the discovery and reuse of digital content by and for people and machines
19. FAIR: History
• The DATA FAIRPORT workshop aimed to define a minimal (yet comprehensive) framework for data discoverability, access, annotation and authoring
• The FAIR acronym was created and guiding principles drafted for comment on the FORCE11 website
• The Principles were refined during the 2015 BioHackathon in Japan
http://www.nature.com/articles/sdata201618
22. FAIR Principles - summarized
Findable
• Globally unique, resolvable, and persistent identifiers
• Machine-readable descriptions to support structured search and filtering
Accessible
• Metadata is accessible beyond the lifetime of the digital resource
• Clearly defined access and security protocols (FAIR != Open)
24. FAIR Principles - summarized
Findable
• Globally unique, resolvable, and persistent identifiers
• Machine-readable descriptions to support structured search and filtering
Accessible
• Metadata is accessible beyond the lifetime of the digital resource
• Clearly defined access and security protocols (FAIR != Open)
Interoperable
• Extensible, machine-interpretable formats for data + metadata
• Use vocabularies and link to other resources
Reusable
• Provide licensing, provenance, and meet community standards
25. Improving the FAIRness of digital resources will increase their quality, their potential for reuse, and the ease of reusing them.
28. Extent of FAIRness may affect what resources people select
29. Measuring FAIRness
• A metric is a standard of measurement.
• It must provide a clear definition of what is being measured and why one wants to measure it.
• It must describe what a valid result is and how one obtains it, so that it can be reproduced by others.
30. Qualities of a Good Metric
• Clear: anyone can understand the purpose of the metric
• Realistic: compliance should not be unduly complicated
• Objective: the assessment can be made in a quantitative, machine-interpretable, scalable and reproducible manner
• Discriminating: the measure can distinguish between those resources that meet the criteria and those that do not
• Universal: the metric should be applicable to all digital resources
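The qualities above suggest that a metric should itself be an explicit, reproducible record. A minimal sketch follows, with field names loosely inspired by (but not identical to) the published FAIR metrics template, and with illustrative values:

```python
# Sketch: representing a FAIR metric as an explicit, reproducible record.
# Field names are our own shorthand; the example values are illustrative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    identifier: str    # e.g. "FM-F1B"
    principle: str     # FAIR sub-principle it measures
    measures: str      # what is being measured
    rationale: str     # why one wants to measure it
    valid_result: str  # what counts as a valid, reproducible result

fm_f1b = Metric(
    identifier="FM-F1B",
    principle="F1",
    measures="whether the identifier scheme guarantees persistence",
    rationale="identifiers that disappear break findability",
    valid_result="a resolvable URL to a persistence policy document",
)
print(fm_f1b.identifier, "->", fm_f1b.principle)
```

Making each field mandatory is what turns the metric from an opinion into a standard of measurement: anyone holding the record can repeat the assessment.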
31. • 14 universal metrics covering each of the FAIR sub-principles. The metrics demand evidence from the community, some of which may require specific new actions.
• Digital resource providers must provide a web-accessible document with machine-readable metadata (FM-F2, FM-F3), detail identifier management (FM-F1B), metadata longevity (FM-A2), and any additional authorization procedures (FM-A1.2).
• They must ensure the public registration of their identifier schemes (FM-F1A), (secure) access protocols (FM-A1.1), knowledge representation languages (FM-I1), licenses (FM-R1.1), provenance specifications (FM-R1.2), and community standards (FM-R1.3).
• They must provide evidence of the ability to find the digital resource in search results (FM-F4), linking to other resources (FM-I3), FAIRness of linked resources (FM-I2), and meeting community standards (FM-R1.3).
33. Compliance with the standard can be automatically assessed
• http://hw-swel.github.io/Validata/ - an RDF constraint validation tool that is configurable to any profile
• Declarative, reusable schema descriptions using Shape Expression (ShEx) constraints
34. A first assessment using the metrics
• Used a simple form to ask for the information needed as input to the FAIR metrics
• Questions require either one or more URLs, or a true/false answer
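Assuming each metric ultimately reduces to a pass/fail outcome, the collected form answers can be aggregated mechanically. A sketch; the FM-* identifiers follow the naming used on the metrics slide, but the results below are invented:

```python
# Sketch: aggregating per-metric pass/fail results into a FAIRness summary.
# Metric identifiers follow the FM-* naming from the slides; the results
# are invented for illustration.

def summarize(results):
    """Return the overall pass fraction and the list of failing metrics."""
    passed = [m for m, ok in results.items() if ok]
    failed = sorted(m for m, ok in results.items() if not ok)
    return len(passed) / len(results), failed

results = {
    "FM-F1A": True, "FM-F1B": True, "FM-F2": True,
    "FM-A1.1": True, "FM-A2": False,
    "FM-I1": True, "FM-R1.1": False,
}
score, failing = summarize(results)
print(f"{score:.0%} of metrics passed; failing: {failing}")
```

A per-principle breakdown (F/A/I/R) rather than a single score is usually more actionable, since it points providers at the specific sub-principles that need work.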
46. H2020 EG: Turning FAIR Data into Reality - Report and Action Plan Consultation
(Draft) Recommendations include:
• Sustainable funding for FAIR components (#5)
• Strategic and evidence-based funding (#6)
• Cross-disciplinary FAIRness (#8)
• Encourage and incentivize data reuse (#19)
• Facilitate automated processing (#25)
• Data science and stewardship skills (#26)
• Skills transfer schemes and brokering roles (#27)
• Curriculum frameworks and training (#28)
Hodson, Simon; Jones, Sarah; Collins, Sandra; Genova, Françoise; Harrower, Natalie; Laaksonen, Leif; Mietchen, Daniel; Petrauskaité, Rūta; Wittenburg, Peter
47. Are we FAIR yet?
• Early claims (including press releases) of being fully FAIR were vastly premature
• FAIRness assessments can demonstrate standing, and some aspects of FAIR are much easier to address than others
• Much more work still needs to be done:
– Compatible data and metadata standards across all disciplines (no more data and metadata silos)
– FAIR by design, using common frameworks
– The development of the FAIR Internet of Data and Services (FIDS) and a FAIR knowledge graph of available resources
– Automated discovery and workflow execution using FIDS
48. Will it be worth it?
FAIR addresses, in a concise manner, the basic requirements associated with publishing and reusing digital resources:
– Lack of high-quality meta(data) reduces usability
– Lack of detailed provenance contributes to irreproducibility
– Lack of clear licensing terms hinders innovation
FAIR is set to accelerate research and discovery and will have worldwide social and economic impact
50. Summary
• FAIR represents a grassroots and global initiative to enhance the discovery and reuse of all kinds of digital resources
• The FAIR ecosystem is maturing quickly, and GO-FAIR offers communities the means to actively participate
• FAIR demands a new social, ethical and technological infrastructure that does not yet exist in whole, but has to be built for and tested by various communities
• Huge benefits are to be had, particularly in augmenting existing research programs and in automated machine processing, but these need to be coupled with proper training and ethics
51. Acknowledgements
Dumontier Lab (Maastricht University, Stanford University, Carleton University)
MU: Seun Adekunle, Remzi Celebi, Dorina Claessens, Ricardo De Miranda Azevedo, Pedro Hernandez Serrano, Massimiliano Grassi, Andine Havelange, Lianne Ippel, Alexander Malic, Kody Moodley, Stuti Nayak, Nadine Rouleaux, Claudia van open, Chang Sun, Amrapali Zaveri
SU: Sandeep Ayyar, Remzi Celebi, Shima Dastgheib, Maulik Kamdar, David Odgers, Maryam Panahiazar, Amrapali Zaveri
CU: Alison Callahan, Jose Toledo-Cruz, Natalia Villaneuva-Rosales
Abstract (Khatri et al., JEM 210(11):2205)
Using meta-analysis of eight independent transplant datasets (236 graft biopsy samples) from four organs, we identified a common rejection module (CRM) consisting of 11 genes that were significantly overexpressed in acute rejection (AR) across all transplanted organs. The CRM genes could diagnose AR with high specificity and sensitivity in three additional independent cohorts (794 samples). In another two independent cohorts (151 renal transplant biopsies), the CRM genes correlated with the extent of graft injury and predicted future injury to a graft using protocol biopsies. Inferred drug mechanisms from the literature suggested that two FDA-approved drugs (atorvastatin and dasatinib), approved for nontransplant indications, could regulate specific CRM genes and reduce the number of graft-infiltrating cells during AR. We treated mice with HLA-mismatched mouse cardiac transplant with atorvastatin and dasatinib and showed reduction of the CRM genes, significant reduction of graft-infiltrating cells, and extended graft survival. We further validated the beneficial effect of atorvastatin on graft survival by retrospective analysis of electronic medical records of a single-center cohort of 2,515 renal transplant patients followed for up to 22 yr. In conclusion, we identified a CRM in transplantation that provides new opportunities for diagnosis, drug repositioning, and rational drug design.