Internal NIH Seminar to the BISTI Team on some early thoughts from the Associate Director for Data Science (ADDS). These ideas are for discussion only and in no way reflect what might happen subsequently. Presented April 1, 2014 (the date is purely a coincidence).
Biomedical Research as Part of the Digital Enterprise
1. Biomedical Research as Part of the Digital Enterprise
Philip E. Bourne Ph.D.
Associate Director for Data Science
National Institutes of Health
2. Disclaimer: I only started March 3, 2014
…but I had been thinking about this prior to my appointment
3. Let me start with a few factoids to get
the ball rolling…
4. The Story of Meredith
http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne
5. 1. The Era of Open Has The Potential
to Deinstitutionalize & Democratize
Daniel Hulshizer/Associated Press
7. 2. I can’t reproduce research from my
own laboratory?
Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11): e80278
9. Characteristics of the Original and
Current Experiment
Original and Current:
– Purely in silico
– Uses a combination of public databases and
open source software by us and others
Original:
– http://funsite.sdsc.edu/drugome/TB/
Current:
– Recast in the Wings workflow system
Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome. PLOS ONE 8(11): e80278
10. Considered the Ability to Reproduce
by Four Classes of User
REP-AUTHOR – original author of the work
REP-EXPERT – domain expert – can reproduce even
with incomplete methods described
REP-NOVICE – basic domain (bioinformatics) expertise
REP-MINIMAL – researcher with no domain expertise
Garijo et al 2013 PLOS ONE 8(11): e80278
11. A Conceptual Overview of the Method
Should Be Mandatory
Garijo et al 2013 PLOS ONE 8(11): e80278
12. Time to Reproduce the Method
Garijo et al 2013 PLOS ONE 8(11): e80278
13. 2. It's not that we could not reproduce the work, but the effort involved was substantial
Any graduate student could tell you
this and little has changed in 40 years
Perhaps it is time we did better?
15. 4. We don’t know
enough about how
existing data are used
[Figure: Structure Summary page activity for H1N1 influenza-related structures, Jan. 2008 to Jul. 2010, for 1RUZ (1918 H1 hemagglutinin) and 3B7E (neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir). Credit: Andreas Prlic]
* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
16. We Need to Learn from Industries Whose
Livelihood Addresses the Question of Use
17. 5. Some would argue we are at an
inflexion point for change
Evidence:
– Google car
– 3D printers
– Waze
– Robotics
18. From the Second Machine Age
From: The Second Machine Age: Work, Progress, and
Prosperity in a Time of Brilliant Technologies by Erik
Brynjolfsson & Andrew McAfee
19. 6. Scholarship is broken
I have a paper with 16,000 citations that no one has
ever read
I have papers in PLOS ONE that have more citations
than ones in PNAS
I have data sets I am proud of but few places to put them
I edited a journal but it did not count for much
22. I cast the solutions in a vision …
something I call the digital enterprise
Any institution is a candidate as a digital enterprise, but let's explore it in the context of the academic medical center
23. Components of The Academic Digital
Enterprise
Consists of digital assets
– E.g. datasets, papers, software, lab notes
Each asset is uniquely identified and has provenance,
including access control
– E.g. publishing simply involves changing the access control
Digital assets are interoperable across the enterprise
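A minimal sketch of the asset model on this slide, not an NIH design. The names `DigitalAsset`, `Access`, `record`, and `publish` are hypothetical and do not come from the talk. It assumes each asset carries a unique identifier, a provenance trail, and an access level, so that publishing is only a change of access level:

```python
from dataclasses import dataclass, field
from enum import Enum
import uuid


class Access(Enum):
    """Hypothetical access levels for an asset."""
    PRIVATE = "private"          # visible to the owner only
    INSTITUTION = "institution"  # visible across the enterprise
    PUBLIC = "public"            # published


@dataclass
class DigitalAsset:
    kind: str    # e.g. "dataset", "paper", "software", "lab notes"
    owner: str
    access: Access = Access.PRIVATE
    # Unique identifier assigned when the asset is created.
    asset_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    # Ordered list of human-readable provenance events.
    provenance: list = field(default_factory=list)

    def record(self, event: str) -> None:
        """Append an event to the provenance trail."""
        self.provenance.append(event)

    def publish(self) -> None:
        """Publishing leaves the asset in place; only the access level changes."""
        self.record(f"access changed from {self.access.value} to public")
        self.access = Access.PUBLIC


notes = DigitalAsset(kind="lab notes", owner="jane")
notes.record("created by jane")
notes.publish()
```

After `publish()` the same object, with the same identifier, is simply marked public, and the change itself is logged as provenance.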
24. Life in the Academic Digital Enterprise
Jane scores extremely well in parts of her graduate on-line neurology class.
Neurology professors, whose research profiles are on-line and well described, are
automatically notified of Jane’s potential based on a computer analysis of her scores
against the background interests of the neuroscience professors. Consequently,
professor Smith interviews Jane and offers her a research rotation. During the
rotation she enters details of her experiments related to understanding a widespread
neurodegenerative disease in an on-line laboratory notebook kept in a shared on-line
research space – an institutional resource where stakeholders provide metadata,
including access rights and provenance beyond that available in a commercial
offering. According to Jane’s preferences, the underlying computer system may
automatically bring to Jane’s attention Jack, a graduate student in the chemistry
department whose notebook reveals he is working on using bacteria for purposes of
toxic waste cleanup. Why the connection? They reference the same gene a number
of times in their notes, which is of interest to two very different disciplines – neurology
and environmental sciences. In the analog academic health center they would never
have discovered each other, but thanks to the Digital Enterprise, pooled knowledge
can lead to a distinct advantage. The collaboration results in the discovery of a
homologous human gene product as a putative target in treating the
neurodegenerative disorder. A new chemical entity is developed and patented.
Accordingly, by automatically matching details of the innovation with biotech
companies worldwide that might have potential interest, a licensee is found. The
licensee hires Jack to continue working on the project. Jane joins Joe’s laboratory,
and he hires another student using the revenue from the license. The research
continues and leads to a federal grant award. The students are employed, further
research is supported and in time societal benefit arises from the technology.
From What Big Data Means to Me JAMIA 2014 21:194
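The Jane and Jack connection in this scenario rests on both notebooks referencing the same gene a number of times. A rough sketch of that matching step follows. It assumes gene symbols have already been extracted from each notebook; the function `shared_terms`, the threshold `min_mentions`, the third owner `ann`, and the placeholder symbols `GENE1` to `GENE5` are all hypothetical:

```python
from collections import Counter
from itertools import combinations


def shared_terms(notebooks, min_mentions=2):
    """Map each pair of owners to the terms both mention at least min_mentions times."""
    # Keep only the terms each owner references repeatedly.
    frequent = {
        owner: {term for term, n in Counter(terms).items() if n >= min_mentions}
        for owner, terms in notebooks.items()
    }
    matches = {}
    for a, b in combinations(sorted(frequent), 2):
        common = frequent[a] & frequent[b]
        if common:
            matches[(a, b)] = common
    return matches


# Placeholder data: one list of extracted gene symbols per notebook owner.
notebooks = {
    "jane": ["GENE1", "GENE2", "GENE1", "GENE3"],
    "jack": ["GENE1", "GENE4", "GENE1"],
    "ann": ["GENE2", "GENE5"],
}
matches = shared_terms(notebooks)
```

Here only Jane and Jack are paired, because `GENE1` is the only symbol that two owners each mention repeatedly. A real system would also need term extraction, synonym handling, and the access-rights and preference checks the scenario describes.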
25. Solution: Break Down the Silos
New policies,
regulations e.g. data
sharing
Economic drivers
The promise of shared
data
26. Solution: Sustainability
The How of Data Sharing
More credit to the data scientists
Change to funding models
Public/Private partnerships
Interagency cooperation
International cooperation
Better evaluation and more informed decisions about
existing and proposed resources – How are current
data being used?
Role of institutional repositories – reward institutions
rather than PIs
27. Solution: Discoverability
Calls for data and software registries (e.g., DDI)
Data commons (NIH drive?)
More clinical trial data in the public domain
Facilitate accessibility and hence access to clinical
data
28. Solution: Training
Calls out for training grants – new and as
supplements to existing training efforts
Regional training centers (cf Cold Spring Harbor)?
29. These problems and potential
solutions have been around a
long time
The good news is that “Big Data” has brought more attention to the problem
30. What Are Big Data?
Large datasets from high throughput experiments
Large numbers of small datasets
Data which are “ill-formed”
The why (causality) is replaced by the what
A signal that a fundamental change is taking place –
a tipping point?
31. The NIH is Starting to Think About the
Digital Enterprise, Witness…
You will hear all about
BD2K from:
– Jennie Larkin
– Warren Kibbe
– Dawei Lin
bd2k.nih.gov
33. One Possible End Point
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB
1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
PLoS Comp. Biol. 2005 1(3) e34
34. To get to that end point we have to
consider the complete research lifecycle
35. The Research Life Cycle will Persist
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
36. Tools and Resources Will Continue To
Be Developed
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Authoring Tools
Lab Notebooks
Data Capture
Software
Analysis Tools
Visualization
Scholarly Communication
37. Those Elements of the Research Life Cycle will
Become More Interconnected Around a Common
Framework
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Authoring Tools
Lab Notebooks
Data Capture
Software
Analysis Tools
Visualization
Scholarly Communication
38. New/Extended Support Structures Will
Emerge
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Authoring Tools
Lab Notebooks
Data Capture
Software
Analysis Tools
Visualization
Scholarly Communication
Commercial & Public Tools
Git-like Resources By Discipline
Data Journals
Discipline-Based Metadata Standards
Community Portals
Institutional Repositories
New Reward Systems
Commercial Repositories
Training
39. We Have a Ways to Go
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Authoring Tools
Lab Notebooks
Data Capture
Software
Analysis Tools
Visualization
Scholarly Communication
Commercial & Public Tools
Git-like Resources By Discipline
Data Journals
Discipline-Based Metadata Standards
Community Portals
Institutional Repositories
New Reward Systems
Commercial Repositories
Training
Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124
http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328