This document discusses where open science is headed and provides one perspective on this issue. It notes that the answer depends on who you ask and then outlines the speaker's background and bias as being from biomedicine and an advocate for open science. It asks what the endpoint of open science should be and discusses some implications of the democratization of science, such as more scrutiny, new types of rewards, and removing artificial boundaries. The speaker then provides some personal examples to illustrate these implications.
Towards Biomedical Research as a Digital Enterprise | Philip Bourne
Philip Bourne outlines his vision of transforming biomedical research into a digital enterprise by making data and other digital assets more open, interoperable, and accessible across boundaries through initiatives like the NIH's Big Data to Knowledge initiative; this would help address issues like the slow pace of discovery and non-reproducibility of research by better connecting scientists and their work.
What Can Happen when Genome Sciences Meets Data Sciences? | Philip Bourne
The document discusses the intersection of genome sciences and data sciences. It provides context on data science definitions, relevant examples at NIH, and challenges. The author argues that fully integrating diverse biomedical data sources through open platforms could accelerate research by enabling new discoveries. However, changing entrenched work practices and incentivizing platform use are challenges. The DSI is working to break down silos through collaboration and practical training to help advance open data and digital integration of research workflows.
Will Biomedical Research Fundamentally Change in the Era of Big Data? | Philip Bourne
This document discusses how biomedical research may fundamentally change in the era of big data. It notes that biomedical research has always been data-driven, but the scope, variety, complexity and volume of data are now much greater. It also discusses the need for more open data sharing and new tools and methods for large-scale analysis. The document suggests biomedical research may move towards a more collaborative "platform" model, as seen with companies like Airbnb, with the goal of improving data access, reuse and reproducibility of research. However, overcoming challenges like incentives, trust and work practices will be important for any new platform to succeed.
Gather evidence to demonstrate the impact of your research | IUPUI
This workshop is the 3rd in a series of 4 titled "Maximize your impact" offered by the IUPUI University Library Center for Digital Scholarship. Faculty must provide strong evidence of impact in order to achieve promotion and tenure. Having strong evidence in year 5 is made easier by strategic dissemination early in your tenure track. In this hands-on workshop, we will introduce key sources of evidence to support your case, demonstrate strategies for gathering this evidence, and provide a variety of examples. These sources include citation metrics, article level metrics, and altmetrics as indicators of impact to support your narrative of excellence.
The Thinking Behind Big Data at the NIH | Philip Bourne
- The document discusses the challenges and opportunities presented by big data in biomedical research. It highlights issues like lack of reproducibility, need for data sharing and standards, and ensuring sustainability of data resources.
- The NIH is organizing various initiatives through the Big Data to Knowledge program to address these issues. This includes developing a biomedical research data commons, training programs, funding for innovation, and modified review processes.
- The goal is to improve data accessibility, support for workflows, relationships with publishers, and metrics to measure reproducibility and reward data sharing. This will help close the research lifecycle loop and advance biomedical discovery.
Open Access Scholarly Communications Series: How to Improve Research(er) Visi... | Nader Ale Ebrahim
Most researchers are evaluated based on their publications as well as the number of citations their publications receive. One of the key ways to increase citations is to expose the research output to the widest possible audience. Publishing a high-quality paper in a scientific journal is only half of the journey toward receiving citations. The other half is advertising and disseminating the publications using the proper “Research Tools”. Citations to an article depend strongly on its visibility rather than on its merit. Researchers spend plenty of time and effort writing up their research for publication, yet most stop working once the paper is published. Disseminating and archiving an article is an essential phase of the publication life cycle. There are tools that help to improve research visibility and impact. Effective use of these tools, which will be elaborated in this webinar, can increase visibility and thus improve paper citations and, in turn, research impact and university ranking.
Bibliometrics is a method for measuring, monitoring, and studying scientific outputs in a given area for various purposes, such as prospecting research opportunities and substantiating scientific research. It is a family of measures that uses a variety of approaches for counting publications, citations, co-citations, bibliographic coupling, keyword co-occurrence, and co-authorship networks. Information technology (IT) tools can be used to assist the process of searching for relevant scientific content, collecting scientific data, and summarizing the results obtained. A bibliometric analysis can be written up before a literature review article or used in the introduction section of any research paper. In this workshop, you will become familiar with “How to Write Your First Bibliometric Paper”.
Open Science: Where Theory Meets Practice | Philip Bourne
The document discusses open science and how theory meets practice. It provides background on the speaker and their experience. It then discusses a case study where open sharing of data could have identified a targetable mutation in a rare childhood brain cancer, DIPG, 3 years earlier, potentially helping 180 children. The document advocates for open science and data sharing to accelerate research.
How to Use Bibliometric Study for Writing a Paper: A Starter Guide | Nader Ale Ebrahim
This document provides information about a workshop on how to use bibliometric studies for writing papers. It includes an abstract describing bibliometrics as a statistical tool for mapping research areas. It then lists topics that will be covered, including definitions, selecting search terms, retrieving and analyzing data, examples, and a question and answer section. Bibliometric methods can help identify relevant research problems and objectives. The document demonstrates how bibliometric analysis can be conducted before writing a literature review or research paper.
Keynote speech - Carole Goble - Jisc Digital Festival 2015 | Jisc
Carole Goble is a professor in the school of computer science at the University of Manchester.
In this keynote, Carole offered her insights into research data management and data centres.
From Where Have We Come & Where Are We Going | Philip Bourne
This document discusses the past and future of FORCE11, a community dedicated to improving scholarly communication. It notes that since 2011:
- New communities are defined by interests rather than domains
- Open data, identifiers, and data/software citation have emerged
It also discusses challenges like maintaining a biomedical focus and opportunities like engaging other communities and pursuing public-private partnerships. Specific opportunities mentioned include pursuing community funding, gaining traction for preprints in life sciences, and leveraging touchpoints with funders around issues like reproducibility, data management, and sustainability. The document encourages stakeholders to identify and pursue these opportunities to help shape the research ecosystem.
Citation Tracking: Enrich Research Visibility and Impact | Nader Ale Ebrahim
Citation tracking is used to discover the most influential articles and how often researchers' own published papers are cited. Citation tracking allows you to find out which parts of your work are appreciated and used by other academics. As a general rule, high-quality articles attract a greater number of citations. There are some indexing services which keep track of citations; however, no single database covers all works that cite other works. Scopus, Web of Science (WOS), Google Scholar, and ResearchGate are some examples of citation tracking tools. In this webinar, I will introduce some important tools for tracking citations to your scholarly output and finding out how often (and by whom) your papers have been cited.
Improve Research Visibility and Impact by Contributing to Wikipedia | Nader Ale Ebrahim
This document contains a presentation by Nader Ale Ebrahim on improving research visibility and impact by contributing to Wikipedia. The presentation covers topics such as what Wikipedia is, how to create an account and contribute articles, how to format text and citations, and how contributions to Wikipedia can positively impact altmetric scores and research visibility. The goal of the presentation is to explain how posting and editing content on Wikipedia is one way for researchers to increase the visibility and impact of their work.
Introduction to Research Data Management at UWA | Katina Toufexis
This document summarizes the key benefits of research data management. It discusses how research data management helps with compliance by meeting requirements of international and national funding agencies as well as publishers. It also promotes efficiency in the research process, ensures security of data, allows access for validation and collaboration, and improves quality through enabling replication. The document provides an overview of the Research Data Management Toolkit available at UWA to support researchers in managing their data over the research lifecycle.
Research Skills Session 8: Avoid Scientific Misconduct | Nader Ale Ebrahim
One of the most important ethical issues in research is “scientific misconduct”, such as fabrication, falsification and plagiarism. Plagiarism can occur at any stage of research activity, such as reporting, communicating, authoring, and peer review. The purpose of this workshop is to engage researchers in their responsibility to conduct ethical research.
Research Data Management Services at UWA (November 2015) | Katina Toufexis
Research Data Management Services at the University of Western Australia (November 2015).
Created by Katina Toufexis of the eResearch Support Unit (University Library).
CC-BY
The document discusses the rise of data science and its disruptive impact on higher education. It analyzes precedents like bioinformatics that were enabled by new digital data sources and technologies. The author advocates that universities should embrace data science by establishing interdisciplinary collaborations, investing in data infrastructure, and ensuring research has societal value and responsibility.
This document provides an overview of a workshop on managing research. The workshop is led by Dr. Nader Ale Ebrahim and covers various research tools that can help researchers plan and manage their work more effectively. The document includes an agenda with topics like selecting keywords, finding relevant papers, evaluating sources, and improving research visibility. It also presents examples of research management tools like Microsoft OneNote, reference managers, and mind mapping software. Various tasks are outlined for attendees, such as creating literature maps and logs of searched terms. The goal is to introduce tools and strategies that can guide researchers on the correct path and produce higher quality outputs.
Genome sharing projects around the world nijmegen oct 29 - 2015 | Fiona Nielsen
Genome sharing projects across the world
Did you ever wonder what happened to the exponential increase in genome sequencing data? It is out there around the world and a lot of it is consented for research use. This means that if you just know where to find the data, you can potentially analyse gigabytes of data to power your research.
In this talk Fiona will present community genome initiatives, the genome sharing projects across the world, how you can benefit from this wealth of data in your work, and how you can boost your academic career by sharing and collaboration.
by Fiona Nielsen, Founder and CEO of DNAdigest and Repositive
With a background in software development Fiona pursued her career in bioinformatics research at Radboud University Nijmegen. Now a scientist-turned-entrepreneur Fiona founded DNAdigest and its social enterprise spin-out Repositive Ltd. Both the charity and company focus on efficient and ethical sharing of genetics data for research to accelerate diagnostics and cures for genetic diseases.
Increasing transparency in Medical Education through Open Data | Rebecca Grant
Slides presented at the AMEE Virtual Conference 2021, introducing the MedEdPublish platform and data policies. Approaches to sharing sensitive human data, and particularly qualitative data, are discussed.
Do Open data badges influence author behaviour? A case study at Springer Nature | Rebecca Grant
Digital badges have previously been shown to incentivise journal authors to share their data openly. In this paper we introduce an Open data badging project at the Springer Nature journal BMC Microbiology. The development of the Open data badge is described, as well as the challenges of developing standard badging criteria and ensuring authors’ awareness of the badges. Next steps for the badging project are outlined, which are based on the experiences of the team assessing the badges, the number of badges awarded at the journal to date, and the results of an author survey.
Figshare for institutions - Jisc Digifest 2016 | Jisc
In May 2015 the EPSRC policy framework on research data came into effect. Salford University partnered with figshare not only to answer the mandate but also to enhance the visibility of the research generated at the institution. All public-facing research outputs are freely available to the wider public at salford.figshare.com.
Learn more about University of Salford’s approach and get a high level overview of the latest figshare functionality.
This document summarizes a presentation about open data and science in Africa. It discusses the benefits of open data, such as enabling more informed decisions and driving development. It also addresses challenges like researchers' fears of having errors or incomplete data exposed. The presentation promotes the African Open Science Platform, which aims to establish open data policies and build capacity through workshops on data skills. The platform connects stakeholders to advance open data and science across Africa.
Recently, many Artificial Intelligence (AI) tools have been developed to summarize documents and extract their key points. Such tools help researchers evaluate a paper's content before reading the entire document. In this workshop, Dr. Nader introduces some tools for reading a paper from his Research Tools Mind Map. The Research Tools enable researchers to follow the correct path in research and to ultimately produce high-quality research outputs with more accuracy and efficiency.
Philip Bourne presented his viewpoint on the future of open science at an NIAID workshop. He argued that as science becomes more democratized, it will lead to more scrutiny of research, a need for new types of rewards beyond publications and citations, and a removal of artificial boundaries between fields. As an example, he discussed how open science allowed two researchers working in different areas to connect via common data references in their notebooks. Bourne believes this digitization and interconnection of research will accelerate, transforming institutions into digital enterprises where digital assets are identifiable and interoperable. However, fully realizing this vision will require coordinating tools across the research lifecycle through common frameworks and developing new support structures.
1. The document discusses some early observations from the Associate Director for Data Science at the National Institutes of Health regarding data at NIH.
2. It notes that NIH does not fully understand how existing data is used, has focused more on why data should be shared rather than how to share it, and lacks plans for long-term sustainability of data.
3. Potential solutions discussed include developing a biomedical commons, modifying the review process, improving education in data science, and expanding the Big Data to Knowledge initiative. The goal is to create a digital research enterprise that better connects all aspects of the research lifecycle.
Internal NIH Seminar to the BISTI Team on some early thoughts from the Associate Director for Data Science (ADDS). These ideas are for discussion only and in no way reflect what might happen subsequently. Presented April 1, 2014 (the date is purely a coincidence).
The document provides logistics for a webinar on data curation profiles and the DMPTool. It includes instructions for calling into the audio, asking questions in the chat, and finding recordings and slides. The webinar will discuss the history of data curation profiles, comparing them to data management plans, and a case study of using data curation profiles. Data curation profiles involve interviewing researchers about their data practices and needs in order to understand how to support them, while data management plans focus on requirements for funding. Both tools can help librarians engage with researchers, though data curation profiles provide a more in-depth understanding of researchers' full data lifecycles.
Open from beginning to end: addressing barriers to open research - a personal...UoLResearchSupport
Open and reproducible research practises are increasingly recognised as important to scientific integrity. However, there are numerous barriers including research culture - whether as a sector, institution or discipline - lack of training and professional incentives and funding of infrastructure.
On 26 May 2021 Dr Marlene Mengoni was one of two speakers at an event exploring barriers to open research.
Dr Marlene Mengoni is a member of the Institute of Medical & Biological Engineering (IMBE) at the University of Leeds and is interested in theoretical aspects of musculoskeletal tissues biomechanics with a fundamental computational engineering approach.
Speaking from an engineering perspective, Dr Mengoni discussed how the research culture at the University of Leeds can help to foster open research practices, throughout the research cycle, including embedding "open" in research and training.
This document provides an overview of Philip Bourne's early observations and thoughts regarding data management at the NIH. Some of the key points are: 1) Existing data resources are not well understood in terms of how they are used; 2) There is a need to focus on how data will be managed and shared, not just why it should be; 3) There is no NIH-wide sustainability plan for data management; 4) Training in biomedical data science is inconsistent. The document discusses some potential solutions such as establishing a NIH data commons and improving training programs.
There are many online and in-person courses available for librarians to learn about research data management, data analysis, and visualization, but after you have taken a course, how do you go about applying what you have learned? While it is possible to just start offering classes and consultations, your service will have a better chance of becoming relevant if you consider stakeholders and review your institutional environment. This lecture will give you some ideas to get started with data services at your institution.
Elsevier CWTS Open Data Report Presentation at RDA meeting in Barcelona Elsevier
The Open Data report is a result of a year-long, co-conducted study between Elsevier and the Centre for Science and Technology Studies (CWTS), part of Leiden University, the Netherlands. The study is based on a complementary methods approach consisting of a quantitative analysis of bibliometric and publication data, a global survey of 1,200 researchers and three case studies including in-depth interviews with key individuals involved in data collection, analysis and deposition in the fields of soil science, human genetics and digital humanities.
The document summarizes a workshop hosted by the NIH Associate Director for Data Science to discuss charting the future of data science at NIH. The workshop goals were to get input from all stakeholders, identify strategic directions, policies, and funding initiatives, and have participants leave as advocates and supporters. The agenda included providing background, open discussion, identifying topics for breakout groups, subgroup discussions, and reporting back. The document provides context on current NIH data science efforts and examples of collaborators in building a biomedical research digital enterprise.
This document discusses research data management and related issues. It defines research data as any information used in research, including observational, experimental, and simulated data. Proper research data management is important for data preservation, access, and reuse. Institutions should establish research data services and policies to address questions around data ownership, sharing standards, and long-term preservation.
This document discusses the need for critical infrastructure to promote data synthesis and evidence-based nutrient management. It outlines 10 steps for real-time data uptake, analysis, and customized nutrient recommendations. Key challenges include data standards, minimum data sets, provenance, and repositories. The Purdue University Research Repository is presented as a solution, providing preservation, curation, and publication of agricultural data. Hands-on support from librarians and agronomists is discussed to help researchers transition data and ensure best practices.
What Data Science Will Mean to You - One Person's ViewPhilip Bourne
This document provides an overview of data science from the perspective of Philip Bourne. Some key points:
- Data science is disruptive to higher education and all disciplines are being impacted by large amounts of digital data.
- Data science can be defined using a 4+1 model focusing on value, design, systems, analytics, and practice.
- Principles of excellence, inclusivity, openness, and fairness should guide data science work.
- Lessons from advances in computational biology and AlphaFold2 show the importance of open data, collaboration, and engineering challenges.
- A data science school should focus on responsible data practices while balancing open research that benefits patients.
Presentation by Prof Lisa Askie, ANZCTR, to the 'Unlocking value from publicly funded Clinical Research Data' workshop, cohosted by ARDC and CSIRO at ANU on 6 March 2019.
This document discusses researchers' perspectives and practices around sharing research data. It finds that while data gathering is not usually the primary focus of research, complete data sharing is important for validation and avoiding duplication. However, researchers face many barriers to sharing, including lack of incentives, time, skills and standards. It also examines funder policies encouraging data sharing and the need to build infrastructure and address cultural and career issues to better support a research culture of open data.
A talk at the Urban Science workshop at the Puget Sound Regional Council July 20 2014 organized by the Northwest Institute for Advanced Computing, a joint effort between Pacific Northwest National Labs and the University of Washington.
Data Science and AI in Biomedicine: The World has ChangedPhilip Bourne
This document discusses the changing landscape of data science and AI in biomedicine. Some key points:
- We are at a tipping point where data science is becoming a driver of biomedical research rather than just a tool. Biomedical researchers need to become data scientists.
- Data science is interdisciplinary and touches every field due to the rise of digital data. It requires openness, translation of findings, and consideration of responsibilities like algorithmic bias.
- Advances like AlphaFold2 show the power of large collaborative efforts combining data, computing resources, engineering, and domain expertise. This points to the need for public-private partnerships and new models of open data sharing.
- The definition of
AI in Medical Education A Meta View to Start a ConversationPhilip Bourne
- AI has the potential to significantly impact medical education and healthcare.
- Chatbots and large language models can provide a rich training ground for students to learn, while augmented reality may change the student-patient dynamic.
- AI tools like predictive analytics and imaging analysis can assist in research, diagnosis, and personalized treatment, but models are still limited and education of implications is needed.
- If developed responsibly with oversight, AI could help democratize healthcare and create new industries, but history shows technology disruptions can also lead to deception if misused. The impacts and timeline of AI in medicine remain uncertain.
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
The document discusses the past, present, and future of artificial intelligence (AI). It describes how AI has advanced due to increases in data and improvements in algorithms and computing technology. An example of AI, ChatGPT, is discussed as using large language models, pre-training, and transformers to generate language. The future of AI is uncertain but could involve neural networks that mimic the brain more closely. AI may disrupt many industries like education and research in the coming years or decades through forces of digitization, disruption, and other factors. The impacts and timeline of AI progress are difficult to predict precisely.
Thoughts on Biological Data SustainabilityPhilip Bourne
This document discusses approaches to improving biological data sustainability. It proposes moving from the current BDS 1.0 model to a BDS 2.0 model. BDS 1.0 is characterized by increasing data and costs but decreasing funds for innovation. BDS 2.0 would recognize the monetary value of data and embrace public-private partnerships and a data economy. It suggests a "data credits" system where data curation is a service with monetary value. The document provides examples of how this could work for the Protein Data Bank (PDB) and more globally. It argues BDS 2.0 could encourage competition, globalization, and private sector engagement to better foster sustainable and FAIR biological data.
The document discusses FAIR data and its importance. FAIR stands for Findable, Accessible, Interoperable, and Reusable. The author argues that data science is becoming a major driver in many fields due to the large amounts of digital data being created. For data and data science to reach their full potential, data needs to be FAIR so it can be easily discovered, accessed, integrated and reused. An example is given of a researcher combining health and vehicle crash data using techniques from data science to improve emergency care. Making data FAIR enables greater collaboration, public-private partnerships and opportunities for translation.
Data Science Meets Biomedicine, Does Anything ChangePhilip Bourne
Data science is driving major changes in biomedical research by enabling new types of integrative, multi-scale analyses. However, biomedical research may no longer lead data science due to a lack of comprehensive data infrastructure and cultural barriers. Responsible data science that balances openness, ethics, and benefiting patients could help establish biomedicine's continued leadership role. Major challenges include limited resources, attracting diverse talent, and prioritizing strategic initiatives over conforming to traditional models of research.
Presented online as part of the NASM series in Advancing Drug Discovery see https://www.nationalacademies.org/event/40883_09-2023_advancing-drug-discovery-data-science-meets-drug-discovery
Biomedical Data Science: We Are Not AlonePhilip Bourne
This document discusses biomedical data science and the opportunities and challenges presented by new developments in data science. Some key points:
- We are at a tipping point where biomedical research is no longer the sole leader in data science due to advances in many other fields. Biomedical researchers need to become data scientists to stay relevant.
- Data science is being driven by the massive growth of digital data and requires an interdisciplinary approach. It is touching every field and attracting many students.
- Developing effective data systems and infrastructure is a major challenge to enable open sharing and analysis of data. Initiatives are underway but more collaboration is needed across sectors.
- Advances in machine learning, like Alpha
BIMS7100-2023. Social Responsibility in ResearchPhilip Bourne
Social responsibility in research refers to conducting studies that benefit society while avoiding harm. It involves considering risks and benefits to human and animal subjects, ensuring transparency and integrity, and engaging stakeholders. Socially responsible research also addresses equity, diversity and inclusion. Data sharing is an important aspect of social responsibility, as it enables reproducibility and collaborative research. However, data must be shared in a FAIR manner and maintained over time to realize its full benefits. Researchers should consider social responsibility throughout the entire research lifecycle.
The document provides an overview of the School of Data Science at the University of Virginia and its approach to collaborating with Novo Nordisk on diabetes research. It discusses that the School of Data Science aims to catalyze discovery through interdisciplinary research, educate a diverse workforce, and serve the community by applying data science. It also provides examples of using artificial intelligence to recognize patterns related to diabetes and details potential areas of collaboration between the School and Novo Nordisk, including student projects, visiting fellows, faculty partnerships, and PhD mentorship.
Towards a US Open research Commons (ORC)Philip Bourne
On August 2nd, 2021, US scientists and officials met to discuss establishing a US Open Research Commons (ORC) to make research data and computing resources more accessible and interoperable across public and private sectors. Currently, US resources are siloed and limited in discoverability. Other countries have established similar initiatives that the US is not formally represented in. An ORC could pool resources to benefit a more diverse group of researchers in addressing societal challenges, but establishing one requires overcoming cultural and institutional barriers between agencies through policy leadership. Immediate action is needed for the US to remain competitive in open science.
This document discusses opportunities for precision education arising from the move to digital education during the COVID pandemic. It notes that for the first time, essentially all educational materials were digital, creating opportunities to make content findable, accessible, interoperable, and reusable. This could improve content quality through transparency and ratings similar to academic publishing. It also enables aggregated views of content and student performance, improved content and syllabi, and recommender systems. Challenges include issues around content ownership, sharing rules, bias, privacy, security, and adoption of next generation learning management systems.
Philip Bourne presented on how data can advance sustainability. He discussed how high throughput DNA digital data changed biomedicine and spawned the new field of data science. Data science now touches all domains, including helping achieve UN Goal 10 of reducing inequalities through projects like using data to understand the history of Native American displacement. While data presents endless opportunities, it also has weaknesses like being messy and non-conclusive, and threats like bias and lack of training. Bourne advocates for building trust through evidence and creating an open data environment to realize data's potential, while acknowledging that sustaining open data faces challenges around proprietary concerns, security, and driving social change.
Frontiers of Computing at the Cellular and Molecular ScalesPhilip Bourne
3 basic points when establishing a new biomedical initiative. Presented at Frontiers of Computing in Health and Society, George Mason University, September 21, 2021.
The document discusses the importance of social responsibility in research. It makes three key points:
1. Research should maximize benefit to society by making findings accessible and usable by the public who funds the research.
2. Under certain conditions like privacy, all research should be openly accessible so others can build upon it.
3. Most research data is lost within 10-15 years of publication according to studies, highlighting the need for open data standards to ensure long-term availability.
NITRD Big Data Interagency Working Group Workshop: Pioneering the Future of Federally Supported Data Repositories Jan 13, 2021 - Opening comments on where we are and one suggestion of where we might go with an International Data Science Institute (IDSI) - A blue sky view.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
This presentation was provided by Steph Pollock of The American Psychological Association’s Journals Program, and Damita Snow, of The American Society of Civil Engineers (ASCE), for the initial session of NISO's 2024 Training Series "DEIA in the Scholarly Landscape." Session One: 'Setting Expectations: a DEIA Primer,' was held June 6, 2024.
How to Manage Your Lost Opportunities in Odoo 17 CRMCeline George
Odoo 17 CRM allows us to track why we lose sales opportunities with "Lost Reasons." This helps analyze our sales process and identify areas for improvement. Here's how to configure lost reasons in Odoo 17 CRM
Strategies for Effective Upskilling is a presentation by Chinwendu Peace in a Your Skill Boost Masterclass organisation by the Excellence Foundation for South Sudan on 08th and 09th June 2024 from 1 PM to 3 PM on each day.
Walmart Business+ and Spark Good for Nonprofits.pdfTechSoup
"Learn about all the ways Walmart supports nonprofit organizations.
You will hear from Liz Willett, the Head of Nonprofits, and hear about what Walmart is doing to help nonprofits, including Walmart Business and Spark Good. Walmart Business+ is a new offer for nonprofits that offers discounts and also streamlines nonprofits order and expense tracking, saving time and money.
The webinar may also give some examples on how nonprofits can best leverage Walmart Business+.
The event will cover the following::
Walmart Business + (https://business.walmart.com/plus) is a new shopping experience for nonprofits, schools, and local business customers that connects an exclusive online shopping experience to stores. Benefits include free delivery and shipping, a 'Spend Analytics” feature, special discounts, deals and tax-exempt shopping.
Special TechSoup offer for a free 180 days membership, and up to $150 in discounts on eligible orders.
Spark Good (walmart.com/sparkgood) is a charitable platform that enables nonprofits to receive donations directly from customers and associates.
Answers about how you can do more with Walmart!"
Leveraging Generative AI to Drive Nonprofit InnovationTechSoup
In this webinar, participants learned how to utilize Generative AI to streamline operations and elevate member engagement. Amazon Web Service experts provided a customer specific use cases and dived into low/no-code tools that are quick and easy to deploy through Amazon Web Service (AWS.)
How to Fix the Import Error in the Odoo 17Celine George
An import error occurs when a program fails to import a module or library, disrupting its execution. In languages like Python, this issue arises when the specified module cannot be found or accessed, hindering the program's functionality. Resolving import errors is crucial for maintaining smooth software operation and uninterrupted development processes.
LAND USE LAND COVER AND NDVI OF MIRZAPUR DISTRICT, UPRAHUL
This Dissertation explores the particular circumstances of Mirzapur, a region located in the
core of India. Mirzapur, with its varied terrains and abundant biodiversity, offers an optimal
environment for investigating the changes in vegetation cover dynamics. Our study utilizes
advanced technologies such as GIS (Geographic Information Systems) and Remote sensing to
analyze the transformations that have taken place over the course of a decade.
The complex relationship between human activities and the environment has been the focus
of extensive research and worry. As the global community grapples with swift urbanization,
population expansion, and economic progress, the effects on natural ecosystems are becoming
more evident. A crucial element of this impact is the alteration of vegetation cover, which plays a
significant role in maintaining the ecological equilibrium of our planet.Land serves as the foundation for all human activities and provides the necessary materials for
these activities. As the most crucial natural resource, its utilization by humans results in different
'Land uses,' which are determined by both human activities and the physical characteristics of the
land.
The utilization of land is impacted by human needs and environmental factors. In countries
like India, rapid population growth and the emphasis on extensive resource exploitation can lead
to significant land degradation, adversely affecting the region's land cover.
Therefore, human intervention has significantly influenced land use patterns over many
centuries, evolving its structure over time and space. In the present era, these changes have
accelerated due to factors such as agriculture and urbanization. Information regarding land use and
cover is essential for various planning and management tasks related to the Earth's surface,
providing crucial environmental data for scientific, resource management, policy purposes, and
diverse human activities.
Accurate understanding of land use and cover is imperative for the development planning
of any area. Consequently, a wide range of professionals, including earth system scientists, land
and water managers, and urban planners, are interested in obtaining data on land use and cover
changes, conversion trends, and other related patterns. The spatial dimensions of land use and
cover support policymakers and scientists in making well-informed decisions, as alterations in
these patterns indicate shifts in economic and social conditions. Monitoring such changes with the
help of Advanced technologies like Remote Sensing and Geographic Information Systems is
crucial for coordinated efforts across different administrative levels. Advanced technologies like
Remote Sensing and Geographic Information Systems
9
Changes in vegetation cover refer to variations in the distribution, composition, and overall
structure of plant communities across different temporal and spatial scales. These changes can
occur natural.
Beyond Degrees - Empowering the Workforce in the Context of Skills-First.pptxEduSkills OECD
Iván Bornacelly, Policy Analyst at the OECD Centre for Skills, OECD, presents at the webinar 'Tackling job market gaps with a skills-first approach' on 12 June 2024
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
South African Journal of Science: Writing with integrity workshop (2024)
Where is Open Going?
1. Where is Open Going?
Philip E. Bourne
pbourne@ucsd.edu
http://www.slideshare.net/pebourne/
3/01/14
2014 SPARC Annual Meeting
1
2. Where is Open Going?
The answer depends on who you ask
Here is my biased viewpoint
3. My Background/Bias
• Mostly Biomedical
• RCSB PDB/IEDB Database Developer – Views on community, quality, sustainability …
• PLOS Journal Co-founder – Open Science Advocate
• Associate Vice Chancellor for Innovation – Business models, interaction with the private sector, sustainability
• Professor – Mentoring, reward system, value (or not) of research
• NIH Strategist/Transformer – ??
4. Perhaps the first question to ask is:
What is the endpoint?
5. Where Is Open Going?
6. What Does The Democratization of Science Imply?
• The obvious – participation by all
• Not so obvious
– More scrutiny
– New types of rewards
– More equal value placed on all participants
– The removal of artificial boundaries that corral knowledge (through power and resources) within silos that do not make sense as complexity increases
7. Consider some personal examples that illustrate these implications
8. More Scrutiny – Highlights Lack of Reproducibility
• I can’t immediately reproduce the research in my own laboratory:
– It took an estimated 280 hours for an average user to approximately reproduce the paper
– Workflows are maturing and becoming helpful
– Data and software versions and accessibility prevent exact reproducibility
Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11): e80278.
9. Why New Types of Rewards?
• I have a paper with 16,000 citations that no one has ever read
• I have papers in PLOS ONE that have more citations than ones in PNAS
• I have data sets I am proud of but few places to put them
• I edited a journal but it did not count for much
10. Equal Value Placed on Participants
• The UC System has Research Scientists (RS) & Project Scientists (PS) as well as tenured faculty – RS/PS have no senate rights yet:
– RS/PS frequently teach
– RS/PS frequently have more grant money
– RS/PS typically perform more service
– RS/PS are most of the data scientists you know
12. Institutional Boundaries
• Academia – Departments of physics, math, biology, chemistry etc. persist but scholars rarely confine themselves to these disciplines
• NIH – 27 institutes and centers, many dedicated to specific diseases & conditions – yet a specific gene may transcend ICs
13. I have argued that the democratization of science is compelling
I have not argued for the value of open access to this picture because you know that already
14. I Would Also Argue That This Process is About to Accelerate
• Others provide a more compelling argument:
– Google car
– 3D printers
– Waze
– Robotics
15. From the Second Machine Age
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee
16. So what will this look like for an institution?
Institutions will become digital enterprises
17. Components of The Academic Digital Enterprise
• Consists of digital assets
– E.g. datasets, papers, software, lab notes
• Each asset is uniquely identified and has provenance, including access control
– E.g. publishing simply involves changing the access control
• Digital assets are interoperable across the enterprise
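The components above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any existing system: the `DigitalAsset` class, the `Access` levels, and the `publish` method are hypothetical names introduced here. The sketch shows an asset that carries a unique identifier and a provenance log, and for which "publishing" is nothing more than a change of access control.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import uuid4


class Access(Enum):
    """Hypothetical access levels; a real enterprise would define its own."""
    PRIVATE = "private"
    INSTITUTION = "institution"
    PUBLIC = "public"


@dataclass
class DigitalAsset:
    kind: str    # e.g. "dataset", "paper", "software", "lab notes"
    title: str
    access: Access = Access.PRIVATE
    # Each asset gets its own identifier when it is created.
    asset_id: str = field(default_factory=lambda: uuid4().hex)
    # Provenance is kept as an append-only list of (actor, action) pairs.
    provenance: list = field(default_factory=list)

    def record(self, actor: str, action: str) -> None:
        self.provenance.append((actor, action))

    def publish(self, actor: str) -> None:
        # Publishing is only a change of access control, logged as provenance.
        self.access = Access.PUBLIC
        self.record(actor, "access changed to public")


notes = DigitalAsset(kind="lab notes", title="Rotation notebook")
notes.record("jane", "created")
notes.publish("jane")
```

Because the identifier and the provenance log travel with the asset, the same record could in principle be referenced from a paper, a dataset, or a notebook, which is one way to read "interoperable across the enterprise".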
18. Life in the Academic Digital Enterprise
• Jane scores extremely well in parts of her graduate on-line neurology class. Neurology
professors, whose research profiles are on-line and well described, are automatically notified of
Jane’s potential based on a computer analysis of her scores against the background interests of the
neuroscience professors. Consequently, professor Smith interviews Jane and offers her a research
rotation. During the rotation she enters details of her experiments related to understanding a
widespread neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line
research space – an institutional resource where stakeholders provide metadata, including access
rights and provenance beyond that available in a commercial offering. According to Jane’s
preferences, the underlying computer system may automatically bring to Jane’s attention Jack, a
graduate student in the chemistry department whose notebook reveals he is working on using
bacteria for purposes of toxic waste cleanup. Why the connection? They reference the same gene a
number of times in their notes, which is of interest to two very different disciplines – neurology and
environmental sciences. In the analog academic health center they would never have discovered
each other, but thanks to the Digital Enterprise, pooled knowledge can lead to a distinct advantage.
The collaboration results in the discovery of a homologous human gene product as a putative target
in treating the neurodegenerative disorder. A new chemical entity is developed and patented.
Accordingly, by automatically matching details of the innovation with biotech companies worldwide
that might have potential interest, a licensee is found. The licensee hires Jack to continue working
on the project. Jane joins Joe’s laboratory, and he hires another student using the revenue from the
license. The research continues and leads to a federal grant award. The students are
employed, further research is supported and in time societal benefit arises from the technology.
From: What Big Data Means to Me, JAMIA 2014 21:194
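The matching step in the scenario, where two notebooks are connected because they reference the same gene a number of times, might look roughly like the following. This is a minimal sketch under stated assumptions: gene mentions are taken to be already extracted from each notebook into a list, the gene symbols are placeholders, and the function names and the `min_mentions` threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations


def shared_genes(mentions_a, mentions_b, min_mentions=2):
    """Genes mentioned at least `min_mentions` times in both notebooks."""
    a, b = Counter(mentions_a), Counter(mentions_b)
    return {
        gene
        for gene in a.keys() & b.keys()
        if a[gene] >= min_mentions and b[gene] >= min_mentions
    }


def suggest_connections(notebooks, min_mentions=2):
    """Return (owner, owner, genes) for every pair of notebooks that overlap."""
    suggestions = []
    for (owner_a, genes_a), (owner_b, genes_b) in combinations(
        sorted(notebooks.items()), 2
    ):
        common = shared_genes(genes_a, genes_b, min_mentions)
        if common:
            suggestions.append((owner_a, owner_b, sorted(common)))
    return suggestions


# Placeholder gene symbols, not real data.
notebooks = {
    "Jane": ["GENE_X", "GENE_X", "GENE_Y"],
    "Jack": ["GENE_X", "GENE_Z", "GENE_X"],
}
print(suggest_connections(notebooks))
```

A real system would also need to respect each person's stated preferences and access rights before surfacing a match, as the scenario notes.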
19. Let us now turn to the biomedical sciences and look at what might happen if the NIH were to become a digital enterprise
20. As of Today
• Assumed the role of Associate Director for Data Science (ADDS):
– NIH Data Science Point Person
– Reports to NIH Director
– Lead the BD2K initiative
– Trans-NIH responsibilities for data
Eric Green, Acting
[Modified slide from Eric Green]
21. The focus is on data, but I do not think that can be separated from the research life cycle as you will see…
22. I Want To Engage With This Community To:
• Help me understand the most pressing problems
• Begin a dialog
• Inform you of what I am currently thinking
• Inform you of relevant NIH initiatives that are underway or planned
• Have you change my thinking appropriately
23. The NIH process thus far …
An external advisory group provided a valuable blueprint for what should be done
acd.od.nih.gov/diwg.htm
24. Blueprint Recommendations
• Promote central and federated catalogs
– Establish minimal metadata framework
– Tools to facilitate data sharing
– Elaborate on existing data sharing policies
• Support methods and applications
– Fund all phases of software development
– Leverage lessons from National Centers
• Training
– More funding
– Enhance review of training apps
– Quantitative component to all awards
• On campus IT strategic plan
– Catalog of existing tools
– Informatics laboratory
– Ditto big data
• Sustainable funding commitment
acd.od.nih.gov/diwg.htm
25. Let me outline in general terms where
I see my effort being spent going
forward
http://pebourne.wordpress.com/2013/12/
26. ADDS Initial Thrusts
• How data are currently being used
• Lightweight metadata standards
• Data & software registries
• Expanded policies on data sharing, open source software
• Training programs & reward systems
• Institutional incentives
• Private sector incentives
• Data centers serving community needs
28. We need to start by asking, how are
we using the data now?
Only then can we make rational
decisions about data – large or small
29. How Data Are Used
[Figure: Structure Summary page activity for H1N1 influenza-related structures, Jan. 2008 to Jul. 2010. Structures shown: 3B7E (neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir) and 1RUZ (1918 H1 hemagglutinin).]
* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
[Andreas Prlic]
30. We Need to Learn from Industries
Whose Livelihood Addresses the
Question of Use
31. ADDS Initial Thrusts – More Detail
• Now:
– Data centers (under review)
– Data science training grants (call out)
– Pilot data catalog consortium (call out)
– Genomic Data Sharing Policy (being finalized)
– Piloting “NIH-drive”
• What Is Planned:
– Extended public-private programs specifically for data science activities
– Interagency activities
– International exchange programs
– Cold Spring Harbor-like training facilities – bi-coastal?
– Programs for better data descriptions
– Reward institutions/communities
– Policies to get clinical trial data into the public domain
33. Pilot NIH-Drive
• Investigator A from the NCI makes frequent reference to the overexpression of genes x and y.
• Investigator B from the NHLBI makes frequent reference to the underexpression of genes x and y.
• Automatic notification of a potential common
interest before publication or database deposition
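The notification idea on this slide can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the profile layout, the function names frequent_genes and common_interests, and the thresholds min_count and min_shared are hypothetical and do not describe how the NIH-Drive pilot is actually built. The sketch only looks at which genes are mentioned often and ignores whether the reported expression is over or under.

```python
from collections import Counter
from itertools import combinations


def frequent_genes(mentions, min_count=3):
    """Return the genes mentioned at least min_count times (hypothetical threshold)."""
    counts = Counter(mentions)
    return {gene for gene, n in counts.items() if n >= min_count}


def common_interests(profiles, min_count=3, min_shared=2):
    """Find pairs of investigators at different institutes who share frequent genes.

    profiles maps an investigator name to a dict with an "institute" string and
    a "mentions" list of gene names. This layout is assumed for the sketch.
    """
    frequent = {
        name: frequent_genes(profile["mentions"], min_count)
        for name, profile in profiles.items()
    }
    matches = []
    for a, b in combinations(sorted(profiles), 2):
        # Only cross-institute pairs are of interest here.
        if profiles[a]["institute"] == profiles[b]["institute"]:
            continue
        shared = frequent[a] & frequent[b]
        if len(shared) >= min_shared:
            matches.append((a, b, sorted(shared)))
    return matches


# Toy data mirroring the slide: A (NCI) and B (NHLBI) both refer often to genes x and y.
profiles = {
    "Investigator A": {"institute": "NCI", "mentions": ["x", "y", "x", "y", "x", "y", "z"]},
    "Investigator B": {"institute": "NHLBI", "mentions": ["y", "x", "x", "y", "y", "x"]},
    "Investigator C": {"institute": "NCI", "mentions": ["q", "q", "q"]},
}

for a, b, genes in common_interests(profiles):
    print(f"Notify {a} and {b}: shared interest in genes {', '.join(genes)}")
```

With the toy data above, only Investigators A and B are paired, because they work at different institutes and both mention genes x and y at least three times. A real system would need to extract gene mentions from working documents and respect privacy before any notification is sent.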
34. Let me come back to the big picture...
35. First consider what we do (or wish we
could do) every day:
We take actions on digital data
increasingly across boundaries
36. Actions on Biomedical Data Imply:
• Ensuring data quality and hence trust
• Making data sustainable
• Making data open and accessible
• Making data findable
• Providing suitable metadata and annotation
• Making data queryable
• Making data analyzable
• Presenting data so as to maximize their value
• Rewarding good data practices
38. Boundaries on Biomedical Data Imply:
• Working across biological scales
• Working across biomedical disciplines
• Working across basic and clinical research and practice
• Working across institutional boundaries
• Working across public and private sectors
• Working across national and international borders
• Working across funding agencies
40. These issues have been around a long
time
The good news is that “Big Data” has
brought more attention to the problem
41. What Are Big Data?
• Large datasets from high throughput
experiments
• Large numbers of small datasets
• Data which are “ill-formed”
• The why (causality) is replaced by the what
• A signal that a fundamental change is taking
place – a tipping point?
42. The NIH is Starting to Think About the
Digital Enterprise, Witness…
bd2k.nih.gov
43. What Will Define the NIH Digital
Enterprise?
• NCBI/NLM
• Trans-NIH collaboration – a culture change
• Long-term NIH strategic planning
• The BD2K Initiative
• A “hub” of data science activities
• International cooperation
• Interagency cooperation
• Data sharing policies
• External forces…
44. This is great, but what will it look like
to the end user and to those
interested in scholarly
communication?
45. One Possible End Point
[Figure, first panel:]
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB
[Figure, second panel:]
1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
PLoS Comp. Biol. 2005 1(3) e34
46. To get to that end point we have to
consider the complete research
lifecycle
47. The Research Life Cycle will Persist
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
48. Tools and Resources Will Continue To Be Developed
[Diagram: Authoring Tools, Lab Notebooks, Data Capture, Analysis Tools, Software, Scholarly Communication, Visualization]
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
49. Those Elements of the Research Life Cycle will Become More Interconnected Around a Common Framework
[Diagram: Authoring Tools, Lab Notebooks, Data Capture, Software, Analysis Tools, Scholarly Communication, Visualization]
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
50. New/Extended Support Structures Will Emerge
[Diagram: Authoring Tools, Data Capture, Lab Notebooks, Analysis Tools, Scholarly Communication, Software, Visualization]
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Support structures:
• Commercial & Public Tools
• Discipline-Based Metadata Standards
• Community Portals
• Git-like Resources By Discipline
• Data Journals
• New Reward Systems
• Training
• Institutional Repositories
• Commercial Repositories
51. We Have a Ways to Go
[Diagram: the same tools, research life cycle, and support structures as on slide 50.]
52. Where is Open Going?
• Slowly towards the democratization of science
• Which changes how institutions think and
operate – they become digital enterprises
• This in turn impacts the scholarly research
lifecycle and hence scholarly communication
• I will be working to help the NIH be a leading
institution in this change