Doing more with fewer resources used to be a challenge mainly for academic scientists. That is unfortunately still true for academics, but we are now seeing others face many of the same pressures. With the squeeze on budgets and the cost cutting that has followed recent worldwide economic turmoil, the failure of many drugs to make it through the pipeline to market, and the rising costs of the drug development process, the pharmaceutical industry is now shifting, perhaps belatedly, to accommodate the same challenge of doing more with less.
There is an abundance of free online tools accessible to scientists and others that can be used for online networking, data sharing and measuring research impact. Despite this, few scientists know how these tools can be used, and fewer still take advantage of them as an integrated pipeline to raise awareness of their research outputs. In this article, the authors describe their experiences with these tools and how to make best use of them to make scientific research more accessible, extending its reach beyond their own direct networks and communicating ideas to new audiences. These efforts have the potential to drive science by sparking new collaborations and interdisciplinary research projects that may lead to future publications, funding and commercial opportunities. The intent of this article is to: describe some of these freely accessible networking tools and affiliated products; demonstrate from our own experiences how they can be utilized effectively; and inspire their adoption by new users for the benefit of science.
This is a presentation given at the Opal Events meeting "Drug Discovery Partnerships: Filling the Pipeline". I was speaking in a session with Jean-Claude Bradley on "Pre-competitive Collaboration: Sharing Data to Increase Predictability". This presentation discussed some of the work we are doing on Open PHACTS. My thanks especially to Carole Goble, Lee Harland and Sean Ekins for their comments.
ContentMining for France and Europe; Lessons from 2 years in the UK - petermurrayrust
I have spent two years carrying out Content Mining (aka Text and Data Mining) in the UK under the 2014 "Hargreaves" exception. This talk was given in Paris, to ADBU, after France had passed its law for a Digital Republic (loi pour une République numérique). I illustrate what worked, what did not, and why, and offer ideas for France and Europe.
Powering Scientific Discovery with the Semantic Web (VanBUG 2014) - Michel Dumontier
In the quest to translate the results of biomedical research into effective clinical applications, many are now trying to make sense of the large and rapidly growing amount of public biomedical data. However, substantial challenges exist in traversing the currently fragmented data landscape. In this talk, I will discuss our efforts to use Semantic Web technologies to facilitate biomedical research through the formulation, publication, integration, and exploration of facts, expert knowledge, and web services.
Towards Responsible Content Mining: A Cambridge perspective - petermurrayrust
ContentMining (Text and Data Mining) is now legal in the UK for non-commercial research. Cambridge UK is a natural centre, with several components:
* a world-class University and Library
* many publishers, both Open Access and conventional
* a digital culture
* ContentMine - a leading proponent and practitioner of mining
Cambridge University Press welcomes content mining and invited PMR to give a talk there. He showed the technology and protocols and proposed a practical way forward in 2017
High throughput mining of the scholarly literature; talk at NIH - petermurrayrust
The scientific and medical literature contains huge amounts of valuable unused information. This talk shows how to discover it, extract, re-use and interpret it. Wikidata is presented as a key new tool and infrastructure. Everyone can become involved. However some of the barriers to use are sociopolitical and these are identified and discussed.
With its focus on investigating the basis for the sustained existence of living systems, modern biology has always been a fertile, if not challenging, domain for formal knowledge representation and automated reasoning. With thousands of databases and hundreds of ontologies now available, there is a salient opportunity to integrate these for discovery. In this talk, I will discuss our efforts to build a rich foundational network of ontology-annotated linked data, develop methods to intelligently retrieve content of interest, uncover significant biological associations, and pursue new avenues for drug discovery. As the portfolio of Semantic Web technologies continues to mature in terms of functionality, scalability, and an understanding of how to maximize their value, researchers will be strategically poised to pursue increasingly sophisticated KR projects aimed at improving our overall understanding of human health and disease.
Bio: Dr. Michel Dumontier is an Associate Professor of Medicine (Biomedical Informatics) at Stanford University. His research aims to find new treatments for rare and complex diseases. His research interests lie in the publication, integration, and discovery of scientific knowledge. Dr. Dumontier serves as a co-chair for the World Wide Web Consortium Semantic Web in Health Care and Life Sciences Interest Group (W3C HCLSIG) and is the Scientific Director for Bio2RDF, a widely used open-source project to create and provide linked data for life sciences.
Mining the scientific literature for plants and chemistry - petermurrayrust
ContentMine can read the daily scientific literature and extract facts. This talk was given to the OpenPlant project, with whom ContentMine collaborates, at a meeting on 2016-07-25/27 in Norwich. Examples of extracted facts are given.
Why study Data Sharing? (+ why share your data) - Heather Piwowar
A presentation to the DBMI department at the University of Pittsburgh about data sharing and reuse: what this means, why it is important, some of what we’ve learned, and what we still don’t know.
Generating Biomedical Hypotheses Using Semantic Web Technologies - Michel Dumontier
With its focus on investigating the nature and basis for the sustained existence of living systems, modern biology has always been a fertile, if not challenging, domain for formal knowledge representation and automated reasoning. Over the past 15 years, hundreds of projects have developed or leveraged ontologies for entity recognition and relation extraction, semantic annotation, data integration, query answering, consistency checking, association mining and other forms of knowledge discovery. In this talk, I will discuss our efforts to build a rich foundational network of ontology-annotated linked data, discover significant biological associations across these data using a set of partially overlapping ontologies, and identify new avenues for drug discovery by applying measures of semantic similarity over phenotypic descriptions. As the portfolio of Semantic Web technologies continues to mature in terms of functionality, scalability and an understanding of how to maximize their value, increasing numbers of biomedical researchers will be strategically poised to pursue increasingly sophisticated KR projects aimed at improving our overall understanding of the capability and behavior of biological systems.
Amanuens.is: Humans and machines annotating scholarly literature - petermurrayrust
About 10,000 scholarly articles ("papers") are published each day. Amanuens.is is a symbiont of ContentMine and Hypothes.is (both Shuttleworth projects/Fellows) which annotates theses using an array of controlled vocabularies ("dictionaries"). The results, in semantic form, are used to annotate the original material. The talk had live demos and used plant chemistry as the examples.
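The dictionary-based annotation that Amanuens.is performs can be sketched very simply: scan text for terms from a controlled vocabulary and record each hit with its identifier and character offsets. A minimal illustration (the dictionary entries, identifiers and sample sentence below are invented for the example, not taken from the real ContentMine dictionaries):

```python
import re

# A toy controlled vocabulary ("dictionary"): term -> identifier.
# These entries and identifiers are invented for illustration.
DICTIONARY = {
    "quercetin": "CHEBI:16243",
    "limonene": "CHEBI:15384",
    "Arabidopsis thaliana": "NCBITaxon:3702",
}

def annotate(text, dictionary):
    """Return (matched text, identifier, start, end) for each dictionary hit."""
    hits = []
    for term, ident in dictionary.items():
        # Word-boundary, case-insensitive match to avoid substring false hits.
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            hits.append((m.group(0), ident, m.start(), m.end()))
    return sorted(hits, key=lambda h: h[2])

text = "Leaves of Arabidopsis thaliana accumulate quercetin glycosides."
for matched, ident, start, end in annotate(text, DICTIONARY):
    print(f"{start:3d}-{end:3d}  {ident:15s}  {matched}")
```

A production annotator would add stemming, abbreviation handling and a trie or Aho-Corasick automaton for large dictionaries; the per-term regex scan here is just the simplest correct baseline.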
Presented by Richard Kidd at "The Future Information Needs of Pharmaceutical & Medicinal Chemistry", Monday 28 November 2011 at The Linnean Society, Burlington House, London, run by the RSC CICAG group.
Published on May 18, 2016 by PMR
Talk to the EBI Industry group on Open Software for chemical and pharmaceutical sciences. Covers examples of chemistry, with demos, and argues that all public knowledge should be Openly accessible.
From Deadly E. coli to Endangered Polar Bear: GigaScience Provides First Cita... - GigaScience, BGI Hong Kong
Slides from the GigaScience press conference at BGI's Bio-IT APAC meeting on the GigaScience website launch and the release of the first previously unpublished animal genomes from the database. Genomes include polar bear, penguin, pigeon and macaque. 6th July 2011.
Can machines understand the scientific literature? - petermurrayrust
With over 5,000 scientific articles published per day, we need machines to help us understand the content. This material is to be used at an interactive session for the Science Society at Trinity College, Cambridge, UK.
Can Computers understand the scientific literature? (includes compsci material) - TheContentMine
Published on Jan 24, 2014 by PMR
With the semantic web, machines can autonomously carry out many knowledge-based tasks as well as humans can. The main problems are not technical but the prevention of access to information. I advocate automatic downloading and indexing of all scientific information.
Amanuens.is: Humans and machines annotating scholarly literature - TheContentMine
Published on May 19, 2016 by PMR
About 10,000 scholarly articles ("papers") are published each day. Amanuens.is is a symbiont of ContentMine and Hypothes.is (both Shuttleworth projects/Fellows) which annotates theses using an array of controlled vocabularies ("dictionaries"). The results, in semantic form, are used to annotate the original material. The talk had live demos and used plant chemistry as the examples.
This review considers the application of CASE systems to a series of examples in which the original structures were later revised. We demonstrate how the chemical structure could be correctly elucidated when 2D NMR data were available and the expert system Structure Elucidator was employed. We also demonstrate that, when only the 1D NMR spectra from the published articles are available, simple empirical calculation of 13C chemical shifts for the hypothetical structures is frequently enough for a researcher to realize that the structural hypothesis is likely incorrect. We also analyze a number of erroneous structural suggestions made by highly qualified and skilled chemists. The investigation of these mistakes is very instructive and has facilitated a deeper understanding of the complicated logical-combinatorial process for deducing chemical structures.
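The 1D-NMR check described above boils down to comparing calculated and experimental 13C shifts and doubting any structure whose average deviation is large. A toy sketch of that screen (the shift lists and the 3 ppm threshold are invented for illustration; real CASE systems use far more sophisticated scoring):

```python
def mean_abs_deviation(calculated, observed):
    """Mean absolute deviation (ppm) between calculated and observed 13C shifts."""
    assert len(calculated) == len(observed)
    return sum(abs(c - o) for c, o in zip(calculated, observed)) / len(calculated)

def plausible(calculated, observed, threshold=3.0):
    """Flag a structural hypothesis as implausible when the average
    calculated-vs-experimental 13C deviation exceeds the threshold (ppm)."""
    return mean_abs_deviation(calculated, observed) <= threshold

# Invented shift lists: two candidate structures against one experiment.
observed    = [170.2, 128.5, 77.1, 55.3, 21.0]
candidate_a = [169.8, 129.0, 76.5, 55.9, 20.4]   # small deviations: plausible
candidate_b = [160.1, 140.2, 85.0, 48.8, 30.5]   # large deviations: likely wrong

print(plausible(candidate_a, observed))  # True
print(plausible(candidate_b, observed))  # False
```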
As scientists we now have a number of domain-specific chemistry databases available online for us to use. While there are hundreds of these "compound databases" available for us to access, very few are developed with the concerns of the analytical scientist in mind. ChemSpider is a free resource from the Royal Society of Chemistry hosting over 28 million chemicals from over 400 data sources and is already utilized by the mass spectrometry community in particular to aid in the process of structure verification. This presentation will give an overview of how ChemSpider has become one of the internet's primary resources for chemists, providing access to chemicals, experimental and predicted properties, patents, publications and analytical data, and ultimately acting as a structure-centric hub. The importance of programming interfaces to allow for integration, and how the primary mass spectrometry vendors are already utilizing ChemSpider, will be discussed. The progress towards providing a dereplication platform for natural products using a combination of mass spectrometry and NMR spectroscopy data will be outlined, and the path forward to a fully chemical structure-enabled internet and the importance of data interchange standards to enable this will be discussed.
At RSC we are involved with a number of projects utilizing components of Advanced Chemistry Development software including nomenclature, physchem prediction and spectroscopy integration. This presentation was given at a small ACD/Labs user meeting in Loughborough, England in February 2013. An associated movie is available on YouTube here: http://youtu.be/2kAsyRU7lRg
Mobile hardware and software technology continues to evolve very rapidly and presents drug discovery scientists with new platforms for accessing data and performing data analysis. Smartphones and tablet computers can now be used to perform many of the operations previously addressed by laptops or desktop computers. Although the smaller screen sizes and requirements for touch screen manipulation can present user interface design challenges, especially with chemistry related applications, these limitations are driving innovative solutions. In this early review of the topic, we collectively present our diverse experiences as software developer, chemistry database expert and naïve user, in terms of what mobile platforms may provide to the drug discovery chemist in the way of apps in the future as this disruptive technology takes off.
I spent the weekend working on a new experiment to show at my kids' school. Because of my experience with Nuclear Magnetic Resonance, I wondered whether magnetic seeding of the hydration spheres would facilitate their growth. Using the BEMEWS catalyst and commercial borax as the materials I got spectacular results... but now I need to figure out how to extend the lifetime of the spheres!
The elucidated structure of asperjinone, a natural product isolated from thermophilic Aspergillus terreus, was revised using the expert system Structure Elucidator. The reliability of the revised structure was confirmed using, as a basis for comparison, 180 structures containing the (3,3-dimethyloxiran-2-yl)methyl fragment, whose chemical shifts contradict the originally suggested structure.
In the last 6 years, high-throughput screening has been used to identify FDA approved drugs that are active against multiple targets (also termed promiscuity). We have identified 34 studies that have screened libraries of FDA approved drugs against various whole cell or target assays. Each study has identified one or more compounds with a new bioactivity that had not been previously described. Thirteen of these drugs were active against more than one additional disease, thereby suggesting a degree of promiscuity. The 109 molecules identified by screening in vitro were statistically more hydrophobic than orphan designated products with at least one marketing approval for a common disease indication or one marketing approval for a rare disease (FDA rare disease research database). We have created a database of in vitro data on old drugs for new uses that could be applied for repositioning these or other molecules for neglected and rare diseases.
On 14th July 2014 a memorial symposium to celebrate the life and work of Professor Jean-Claude Bradley, the father of Open Notebook Science, used this photo loop to connect us to some of his activities and give us a glimpse into his personal life.
The number of social networking sites available to scientists continues to grow. We are being indexed and exposed on the internet via our publications, presentations and data. We have many ways to contribute, annotate and curate, many of them as part of a growing crowdsourcing network. As one of the founders of the online ChemSpider database I was drawn into the world of social networking to participate in the discussions that were underway regarding our developing resource. As a result of my experiences in blogging, and as a result of developing collaborations and engagement with a large community of scientists, I have become very immersed in the expanding social networks for science. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways that I expose my scientific activities online. Many of these activities will ultimately contribute to the developing measures of me as a scientist as identified in the new world of alternative metrics.
Science has evolved from the isolated individual tinkering in the lab, through the era of the "gentleman scientist" with his or her assistant(s), to group-based and then expansive collaboration, and now to an opportunity to collaborate with the world. With the advent of the internet, the opportunity for crowd-sourced contribution and large-scale collaboration has exploded and, as a result, scientific discovery has been further enabled. The contributions of enormous open data sets, liberal licensing policies and innovative technologies for mining and linking these data have given rise to platforms that are beginning to deliver on the promise of semantic technologies and nanopublications, facilitated by the unprecedented computational resources available today, especially the increasing capabilities of handheld devices. The speaker will provide an overview of his experiences in developing a crowdsourced platform for chemists allowing for data deposition, annotation and validation. The challenges of mapping chemical and pharmacological data, especially in regards to data quality, will be discussed. The promise of distributed participation in data analysis is already in place.
The Royal Society of Chemistry provides access to a number of databases hosting chemicals data, reactions, spectroscopy data and prediction services. These databases and services can be accessed via web services using queries in standard data formats such as InChI and molfiles. Data can then be downloaded in standard structure and spectral formats allowing for reuse and repurposing. The ChemSpider database integrates with a number of projects external to RSC, including Open PHACTS, which integrates chemical and biological data and utilizes semantic web data standards including RDF. This presentation will provide an overview of how structure and spectral data standards have been critical in allowing us to integrate many open source tools, have eased integration with a myriad of services, and underpin many of our future developments.
Many students spend enormous amounts of their time engaged with their computers, accepting of course that mobile devices are simply computers of a different form factor. Engaged with the social networks, utilizing computer platforms to source and share content of various forms, their contributions of "data" into what is the cloud, and in many cases a void, are enormous. What community and career benefit might result from those students spending some of their time contributing chemistry related data to the world? What challenges lie in the way of their participation, and how might participating have a positive or negative impact on their future career? The Royal Society of Chemistry hosts a number of chemistry data platforms to which students can actively contribute and for which their participation can be measured. Moreover, the RSC's micropublishing platform allows chemists to learn how to write up their scientific work, obtain review from their peers and chemistry professors in a non-threatening environment, and produce an online published work in less than a day that is both citable and available as a shared resource for the community. This presentation will demonstrate how to participate and encourage engagement from students early in their education. There are no longer any technology barriers to the sharing of the majority of chemistry related data.
Many of us nowadays invest significant amounts of time in sharing our activities and opinions with friends and family via social networking tools. Yet although many platforms exist for scientists to connect and share with their peers, the majority do not make use of these tools, despite their promise and potential influence on our future careers. We are being indexed and exposed on the internet via our publications, presentations and data. We also have many more ways to contribute to science, to annotate and curate data, and to "publish" in new ways, many of them as part of a growing crowdsourcing network. This presentation will provide an overview of the various types of networking and collaborative sites available to scientists and ways to expose your scientific activities online. Many of these can ultimately contribute to the developing measures of you as a scientist in the new world of alternative metrics. Participating offers a great opportunity to develop a scientific profile within the community and may ultimately be very beneficial, especially to scientists early in their career.
Accessing 3D Printable Chemical Structures Online
We have been exploring routes to create 3D printable chemical structure files (.WRL and .STL). These digital 3D files can be generated directly from crystallographic information files (.CIF) using a variety of software packages such as Jmol. After proper conversion to the .STL (or .WRL) file format, the chemical structures can be fabricated into tangible plastic models using 3D printers. This technique can theoretically be used for any molecular or solid structure. Researchers and educators are no longer limited to building models via traditional piecewise plastic model kits. As such, 3D printed molecular models have tremendous value for teaching and research. As the number of available 3D printable structures continues to grow, there is a need for a robust chemical database to store these files. This presentation will discuss our efforts to incorporate 3D printable chemical structures within the Royal Society of Chemistry’s online compound database.
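As a rough illustration of the pipeline described in this abstract, the sketch below assembles a Jmol script and a headless launch command for converting a .CIF file into a .STL file. The script keywords follow Jmol's scripting language, but the specific flags, rendering choices and file names here are illustrative assumptions rather than a tested recipe; consult the documentation for your Jmol version.

```python
def jmol_conversion_commands(cif_path, stl_path):
    """Build an (illustrative) Jmol script plus a headless launch command."""
    script = "\n".join([
        f'load "{cif_path}"',   # read the crystallographic information file
        "spacefill only",       # render atoms as spheres, a printable solid
        f'write "{stl_path}"',  # export; Jmol infers the format from the extension
    ])
    # -n runs without a display; -s runs a script file (check your Jmol version).
    command = ["java", "-jar", "Jmol.jar", "-n", "-s", "convert.spt"]
    return script, command

script, command = jmol_conversion_commands("caffeine.cif", "caffeine.stl")
print(script)
```

The resulting .STL file can then be sent to any consumer 3D printer's slicing software, just as with any other printable mesh.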
I have started using a new website, www.growkudos.com, to help me enrich, expose and measure my publications. This is VERY EARLY in my exposure to and usage of the platform, but I am already excited by the possibilities. The resulting Kudos page is here: https://www.growkudos.com/articles/10.1371/journal.pone.0062325
Access to scientific information has changed in a manner that was likely never imagined by the early pioneers of the internet. The quantities of data, the array of tools available to search and analyze them, the devices, and the shift in community participation all continue to expand, while the pace of change does not appear to be slowing. ChemSpider is one of the chemistry community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves tens of thousands of chemists every day, and it serves as the foundation for many important international projects to integrate chemistry and biology data, facilitate drug discovery efforts, and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of this eScience cheminformatics platform and the nature of the solutions that it helps to enable, including structure validation, text mining and semantic markup, the National Chemical Database Service for the United Kingdom, and the development of a chemistry data repository. We will also discuss the possibilities it offers in the domain of crowdsourcing and open data sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community and facilitated by collaboration, ultimately accelerating scientific progress.
There is an overwhelming number of new resources for chemistry that would likely benefit both librarians and students in terms of improving access to data and information. While the commercial solutions provided by an institution may be the primary resources, there is now an enormous range of online tools, databases, resources, apps for mobile devices and, increasingly, wikis. This presentation will provide an overview of how wiki-based resources for scientists are evolving and will introduce a number of them, including wikis that are being used to teach chemistry to students as well as to source information about scientists, scientific databases and mobile apps.
As a result of internet technologies supporting participation on the web via blogs, wikis and other social networking approaches, chemists now have an opportunity to contribute to the growing chemistry content online. An important skill for scientists to develop is the ability to succinctly report the details of scientific experimentation in a published format. The Royal Society of Chemistry provides a number of online systems for sharing chemistry data, the best known of these being the ChemSpider database. In parallel, the ChemSpider SyntheticPages (CSSP) platform is an online publishing venue for scientists, and especially students, to publish the details of chemical syntheses they have performed. Using the rich capabilities of internet platforms, including the ability to display interactive spectral data and movies, CSSP is an ideal environment for students to publish their work, especially syntheses that might not warrant mainstream publication.
Our access to scientific information has changed in ways that were hardly imagined even by the early pioneers of the internet. The immense quantities of data and the array of tools available to search and analyze online content continue to expand, while the pace of change does not appear to be slowing. ChemSpider is one of the chemistry community’s primary online public compound databases. Containing tens of millions of chemical compounds and their associated data, ChemSpider serves tens of thousands of chemists every day, and it serves as the foundation for many important international projects to integrate chemistry and biology data, facilitate drug discovery efforts, and help to identify new chemicals from under the ocean. This presentation will provide an overview of the expanding reach of the ChemSpider platform and the nature of the solutions that it helps to enable. We will also discuss the possibilities it offers in the domain of crowdsourcing and open data sharing. The future of scientific information and communication will be underpinned by these efforts, influenced by increasing participation from the scientific community and facilitated by collaboration, ultimately accelerating scientific progress.
The recent announcement that GlaxoSmithKline has released a huge tranche of whole-cell malaria screening data into the public domain, accompanied by a corresponding publication, raises some issues for consideration before this exemplar becomes a trend. We have examined the data at a high level, by studying the molecular properties and considering the various alerts presently in use by major pharma companies. We acknowledge the potential value of such data but also raise the question of the actual value of such datasets released into the public domain. We also suggest approaches that could enhance the value of such datasets to the community and theoretically offer more immediate benefit to the search for leads for other neglected diseases.
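The high-level property profiling mentioned above can be as simple as a Lipinski rule-of-five pass over precomputed descriptors. This is only a sketch: it assumes molecular weight, logP and hydrogen-bond donor/acceptor counts have already been calculated with a cheminformatics toolkit, and the compound names and values below are invented for illustration.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count rule-of-five violations for one compound's precomputed descriptors."""
    return sum([
        mol_weight > 500,   # molecular weight over 500 Da
        logp > 5,           # calculated logP over 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ])

# Screen a toy dataset: keep compounds with at most one violation.
compounds = {
    "cmpd-1": (320.4, 2.1, 2, 5),   # drug-like
    "cmpd-2": (612.8, 6.3, 4, 11),  # fails three criteria
}
leads = [name for name, d in compounds.items() if lipinski_violations(*d) <= 1]
print(leads)  # ['cmpd-1']
```

Real screening-set triage adds further structural alerts (e.g. reactive-group filters) on top of such simple property cutoffs.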
The Open Drug Discovery Teams (ODDT) project provides a mobile app primarily intended as a research topic aggregator of predominantly open science data collected from various sources on the internet. It exists to facilitate interdisciplinary teamwork and to relieve the user from data overload, delivering access to information that is highly relevant and focused on their topic areas of interest. Research topics include areas of chemistry and adjacent molecule-oriented biomedical sciences, with an emphasis on those which are most amenable to open research at present. These include rare and neglected diseases, and precompetitive and public-good initiatives such as green chemistry.
The ODDT project uses a free mobile app as its user entry point. The app has a magazine-like interface and server-side infrastructure for hosting chemistry-related data as well as value-added services. The project is open to participation from anyone and allows users to make annotations and assertions, thereby contributing to the collective value of the data for the engaged community. Much of the content is derived from public sources, but the platform is also amenable to commercial data input. The technology could also readily be used in-house by organizations as a research aggregator that integrates internal and external science and discussion. The infrastructure for the app is currently based upon the Twitter API as a useful proof of concept for a real-time source of publicly generated content. This could be extended further by accessing other APIs providing news and data feeds relevant to a particular area of interest. As the project evolves, social networking features will be developed for organizing participants into teams, with various forms of communication and content management possible.
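The aggregation idea can be sketched in a few lines: given a stream of posts, route each one to the research topics whose hashtag it mentions. The post structure and hashtags below are invented for illustration and are not the actual ODDT implementation.

```python
def route_posts(posts, topic_hashtags):
    """Group incoming posts under each research topic whose hashtag they carry."""
    topics = {topic: [] for topic in topic_hashtags}
    for post in posts:
        for topic, tag in topic_hashtags.items():
            if tag in post["text"].lower():
                topics[topic].append(post["id"])
    return topics

posts = [
    {"id": 1, "text": "New assay results #malaria"},
    {"id": 2, "text": "Solvent swap ideas #greenchemistry"},
    {"id": 3, "text": "Repurposing screen hits #malaria #tuberculosis"},
]
tags = {"malaria": "#malaria", "green chemistry": "#greenchemistry"}
print(route_posts(posts, tags))  # {'malaria': [1, 3], 'green chemistry': [2]}
```

In the real system the incoming stream would come from the Twitter API (or other feeds), with annotation and curation layered on top of this basic routing.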
This presentation outlines a mechanism for using the power of "Big Data", social networking and technology infrastructure to speed the process of curing a horrible disease.
In the world of pharma, precompetitive efforts are increasing. These developments have created a dynamic ecosystem with pharma companies as smaller nodes in a complex network, in which collaborations have become an important business model.
DRUGS New agreement to tackle pharmaceutical pollution p.164
WORLD VIEW Vaccination the best way to measure health care p.165
DUNG OVER Rolling beetles fooled by look-alike seeds p.167
Let’s think about cognitive bias
The human brain’s habit of finding what it wants to find is a key problem for research. Establishing robust methods to avoid such bias will make results more reproducible.
“Ever since I first learned about confirmation bias I’ve been seeing it everywhere.” So said British author and broadcaster Jon Ronson in So You’ve Been Publicly Shamed (Picador, 2015).
You will see a lot of cognitive bias in this week’s Nature. In a series of articles, we examine the impact that bias can have on research, and the best ways to identify and tackle it. One enemy of robust science is our humanity — our appetite for being right, and our tendency to find patterns in noise, to see supporting evidence for what we already believe is true, and to ignore the facts that do not fit.
The sources and types of such cognitive bias — and the fallacies they produce — are becoming more widely appreciated. Some of the problems are as old as science itself, and some are new: the IKEA effect, for example, describes a cognitive bias among consumers who place artificially high value on products that they have built themselves. Another common fallacy in research is the Texas sharpshooter effect — firing off a few rounds and then drawing a bull’s eye around the bullet holes. And then there is asymmetrical attention: carefully debugging analyses and debunking data that counter a favoured hypothesis, while letting evidence in favour of the hypothesis slide by unexamined.
Such fallacies sound obvious and easy to avoid. It is easy to think that they only affect other people. In fact, they fall naturally into investigators’ blind spots (see page 182).
Advocates of robust science have repeatedly warned against cognitive habits that can lead to error. Although such awareness is essential, it is insufficient. The scientific community needs concrete guidance on how to manage its all-too-human biases and avoid the errors they cause.
That need is particularly acute in statistical data analysis, where some of the best-established methods were developed in a time before data sets were measured in terabytes, and where choices between techniques offer abundant opportunity for errors. Proteomics and genomics, for example, crunch millions of data points at once, over thousands of gene or protein variants. Early work was plagued by false positives, before the spread of techniques that could account for the myriad hypotheses that such a data-rich environment could generate.
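The multiple-hypothesis corrections alluded to here can be made concrete with the Benjamini–Hochberg false-discovery-rate procedure, one widely used such technique, sketched in plain Python with invented p-values:

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return indices of hypotheses rejected at FDR level alpha."""
    m = len(pvals)
    # Sort p-values ascending, remembering original positions.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k with p_(k) <= (k/m) * alpha.
    threshold_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            threshold_k = rank
    # Reject the hypotheses with the threshold_k smallest p-values.
    return sorted(order[:threshold_k])

# Ten toy p-values: a naive 0.05 cutoff would reject the first five,
# but BH controls the expected proportion of false discoveries.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))  # [0, 1]
```

The point of the example is the gap: several results that look "significant" one at a time do not survive once the number of simultaneous tests is accounted for.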
Although problems persist, these fields serve as examples of communities learning to recognize and curb their mistakes. Another example is the venerable practice of double-blind studies. But more effort is needed, particularly in what some have called evidence- ...
Slides of talk "Open Science, Open Data, Science 2.0: What Are They and Why Should Medical Librarians Care?" given at the 2010 annual meeting of the Pacific Northwest Chapter of the Medical Library Association.
An overview of the book "Collaborative Computational Technologies for the Life Sciences", edited by Sean Ekins, M.Sc., Ph.D., D.Sc., Maggie A.Z. Hupcey, Ph.D., and Antony J. Williams, Ph.D., published by Wiley as part of the Technologies for the Pharmaceutical Industry series.
Exploring a model of citizen-led collective governance and management of health data
This model would allow citizens to share their health data to accelerate research and innovation, in order to maximize social and collective benefits.
Participant-centered research design and “equal access” data sharing practice... (Jason Bobe)
Topics include:
What is "equal access" to data?
How have the roles of human subjects expanded over time?
Where has equal access to data been a success?
What are the barriers to equal access in research?
Talk presented at CONFOA 2013 (Universidade de São Paulo, São Paulo, Brazil, 6–8 October 2013), in Session III, "Open science and research data management", by Prof. Dr. Peter Elias, UNITED KINGDOM, The Royal Society of UK.
Reaching out to collaborators and crowdsourcing for pharmaceutical research
Reaching Out To Collaborators: Crowdsourcing for Pharmaceutical Research
Sean Ekins (1,2,3,4) and Antony J. Williams (5)
(1) Collaborations in Chemistry, 601 Runnymede Avenue, Jenkintown, PA 19046, U.S.A.
(2) Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94403.
(3) Department of Pharmaceutical Sciences, University of Maryland, MD 21201, U.S.A.
(4) Department of Pharmacology, University of Medicine & Dentistry of New Jersey (UMDNJ)-Robert Wood Johnson Medical School, 675 Hoes Lane, Piscataway, NJ 08854.
(5) Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC 27587.
Running Head: Crowdsourcing
Corresponding Authors: Sean Ekins, Collaborations in Chemistry, 601 Runnymede Ave, Jenkintown, PA 19046, Email ekinssean@yahoo.com, Tel 215-687-1320. Antony J. Williams, Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC 27587, Email antony.williams@chemspider.com.
Doing more with fewer resources used to be a situation common only to academic scientists. This is unfortunately still true for academics, but we are seeing others facing many of the same challenges. With the squeeze on budgets and the cost cutting resulting from recent worldwide economic challenges, the failure of many drugs to make it through the pipeline to the market, and the increasing costs associated with the drug development process, we are now seeing a dramatic shift in the pharmaceutical industry, perhaps belatedly, to accommodate the similar challenge of doing more with less. It could also represent the further crumbling of a more than 150-year-old paradigm of the large company being the predominant source for developing therapeutics for profit. We are also seeing increased discussion about different models of facilitating pharmaceutical research, as well as the suggestion of opportunities to collaborate and use tools that perhaps would not have been considered in the past (1-4). This shift to “openness” in certain areas, specifically the sharing of pre-competitive data and processes, parallels the societal shifts we have seen in so many areas: open source software development, the sharing of data, and the utility of free data resources and repositories such as PubChem (http://pubchem.ncbi.nlm.nih.gov/) and others (see Table 1). From the extreme of keeping entire projects in house there is a shift to decentralized research. One view of pharmaceutical research is to use loose networks of external researchers from companies, academia or consultancies, create a community around a shared interest and gather their ideas. We think this comes closest to the ideal of crowdsourcing, where the wisdom of the many and their varied perspectives can benefit community-based efforts, whether in software, knowledge capture or elsewhere. Crowdsourcing, loosely defined as “outsourcing a task to a group or community of people in an open call”, is a relatively new phenomenon, culture or movement, best summarized in the book “Wikinomics: How Mass Collaboration Changes Everything” (5). Living in our connected world, pharmaceutical researchers can communicate in a variety of ways (4) to leverage ideas from around the globe. These ideas do not have to come from within the walls of a single organisation. Taking this further: why limit access to just ideas? Open tools and data could feed an ecosystem. It could also breed a new class of researcher without affiliation, who has allegiance to neither company nor research organization. They test their hypotheses with data from elsewhere; they do their experiments through a network of collaborations; they may have no physical lab; and while a shared cause may not be essential, confidentiality agreements and software may unite them as a loose cooperative. Such approaches may become more commonplace, like the Open Innovation efforts represented by companies such as NineSigma (http://en.wikipedia.org/wiki/Ninesigma) and Innocentive (http://en.wikipedia.org/wiki/Innocentive). The One Billion Minds approach to open innovation (http://www.onebillionminds.com/) has already been mapped into the life sciences, where a million minds in the community have been called to participate in community annotation in Wikiproteins (http://genomebiology.com/2008/9/5/R89).
A recent example of the power of crowdsourcing is the availability of freely accessible online resources to enable and support drug discovery. For instance, online databases including PubChem, the Chemical Entities of Biological Interest (ChEBI) database (http://www.ebi.ac.uk/chebi/), DrugBank (http://www.drugbank.ca/), the Human Metabolome Database (http://www.hmdb.ca/) and ChemSpider (http://www.chemspider.com/) represent good examples (6-8), in addition to commercial databases (9) and collaborative systems like CDD (http://www.collaborativedrug.com). These represent either government or privately funded initiatives with vastly differing resources and scopes. Chemistry (and with it biology) information on the internet has thus become more accessible just as we are seeing a massive increase in screening data coming from individual laboratories. Sometimes there are synergistic benefits of crowdsourcing: for example, the ChemSpider platform, originally a hobby project housed in a basement and recently acquired by the Royal Society of Chemistry, has been acknowledged to have greatly enriched the content in the NIH’s PubChem (9). We are also seeing crowdsourcing applied to get more perspectives on a problem, for example the annotation of 64 putative tools and probes from the NIH Roadmap MLSCN effort by scientists from different groups, using multiple filtering methods or molecule quality metrics (10).
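To give a concrete sense of what "freely accessible" means in practice, databases such as PubChem expose programmatic interfaces alongside their web front ends. The sketch below builds a request URL following PubChem's published PUG REST conventions; the compound name and property list are arbitrary examples.

```python
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

def pubchem_property_url(name, properties):
    """Build a PUG REST URL fetching named properties for a compound by name."""
    props = ",".join(properties)
    return f"{PUG_REST}/compound/name/{name}/property/{props}/JSON"

url = pubchem_property_url("aspirin", ["MolecularFormula", "MolecularWeight"])
print(url)
```

Fetching such a URL with any HTTP client returns a small JSON record, which is precisely the kind of machine-readable access that lets these public databases feed data mining and modeling efforts.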
What does the future hold for such databases and other crowdsourcing efforts, and what are some of the challenges to be aware of? While access to very large datasets as a starting point for biological information and modeling may be of value, there should be concerns regarding the quality of the compounds used for screening, e.g. will there be a high percentage of false positives? What about the fidelity of the data? Is the same batch of compound used by different groups? Are there experimental differences that result in large inter-lab differences in the manner in which they use technologies (11)? Do cell passage numbers differ? Are the internal standards the same? What is the diet of the animals used? What is the impact of dissolution variance (12)? Will the naïve user actually be able to dissect out the false positives or issues with data curation (13), which may represent a potential pressure point? What about issues with data protection, anonymity, ethics and tissue handling (14)? There are a myriad of other related questions.
5. On the opportunity side there may be some obvious value in the smaller scale
experiments from individual laboratories being stored in a single location. Perhaps we
can learn from the systems biology and network-building software communities, which
have manually or automatically annotated large databases (instances of object X
interacting with object Y, either directly or indirectly) from individual experiments (15).
Kinetic data are rarely captured in these efforts, yet if a database of such
information could be created it would become widely accessible. We therefore see a
need, driven predominantly by the academic community, for the curation of single
experiments in biology, with the benefits of preventing repetition and potentially
decreasing animal and reagent usage. This drive to curate biology can be encouraged by
publishers and funding agencies; once annotated in a suitable format and location (the
experimental protocol would need to be captured, and an ontology would be essential
(16)), the information could be freely available for other efforts, whether in data
mining, SAR, software development or network building. The goal should be to bring
scientists to a point where their data are shared and useful. It is one thing to provide large
supplemental files with publications, but it is another to put the data in a location and
format from which others can actually learn. Perhaps Pharmaceutical Research (and,
for that matter, other pharmaceutical journals) could play a role in ensuring that data
within journal articles are deposited in freely accessible databases such as
PubChem and ChemSpider, and beyond. Ultimately, we foresee a highly
networked structure linking the many crowdsourced database efforts to reduce
redundancy. While we have already seen dramatic growth in accessible databases,
innovation around computational methods for data analysis and mining has not
kept pace (13). There is an opportunity here for the scientific community to address these
needs, and we may see a new wave of innovation from informatics companies. This could
be catalyzed by public or private funding, or even by crowdsourced X PRIZE-style
awards (http://www.xprize.org/future-x-prizes/life-sciences).
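To make the curation idea above concrete, here is a minimal sketch of what a single-experiment interaction record might look like, capturing the experimental protocol, ontology annotation, and the (rarely recorded) kinetic parameters. The schema, field names, and the CYP3A4/midazolam example values are illustrative assumptions only, not a proposed standard.

```python
# Hypothetical sketch of a curated single-experiment record of the kind
# discussed above. Field names and values are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class InteractionRecord:
    subject: str                  # e.g. a protein identifier
    object: str                   # e.g. a metabolite or second protein
    interaction_type: str         # "direct" or "indirect"
    protocol: str                 # free text or a link to a protocol repository
    ontology_terms: list = field(default_factory=list)  # e.g. Gene Ontology IDs
    km_uM: Optional[float] = None       # kinetic parameters, when measured
    kcat_per_s: Optional[float] = None
    source_lab: str = ""
    citation: str = ""

# Example entry (illustrative values, not a literature-curated datum):
record = InteractionRecord(
    subject="CYP3A4",
    object="midazolam",
    interaction_type="direct",
    protocol="microsomal incubation; see lab notebook entry",
    ontology_terms=["GO:0016491"],  # oxidoreductase activity (illustrative)
    km_uM=3.9,
    source_lab="example lab",
)
print(record.subject, record.km_uM)
```

Even a schema this simple makes the point: the kinetic fields are optional precisely because they are so rarely reported, and making them first-class citizens of the record is what would allow a downstream database to accumulate them.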
Perhaps such an open drug discovery model also needs some degree of initial focus
to increase its probability of success, perhaps around a neglected disease
such as malaria or tuberculosis (TB), or even a rapidly emerging disease (such as swine
flu), to demonstrate that it is more than a utopian concept. The incentive here is that these
diseases are rapidly becoming a greater global concern and could increase demand on
healthcare resources (e.g. the reemergence of TB and its ease of transmission). Targeted
questions could be posed to the crowd regarding approaches to surmounting TB drug
resistance or latency, identifying targets, or developing novel delivery mechanisms (17,
18).
As individuals we are continually challenged by demands on our time and
resources, both financial and intellectual, yet participating in such efforts would surely
be a strong motivating factor for many. Some companies allow their employees to pursue
personal projects for a small percentage of their time to foster creativity. Why not allow
them to give back in the same way by contributing to open-source science? Perhaps
governments can recognize the potential benefits and provide participating companies tax
credits or other incentives. For many the motivation to participate in open drug discovery
may not be financial but purely philanthropic in nature or simply the pursuit of an
intellectual challenge. Think of it as the ultimate challenge, in which scientists collaborate
with thousands of people to improve global health. We would welcome suggestions from
the various stakeholders with an interest in all aspects of the pharmaceutical R&D value
chain on how open pharmaceutical collaborations could be facilitated. This is certainly an
unsettling time for the industry, but after the storm has settled we may be in a unique
position to do further aspects of R&D differently and more cost-effectively, with
implications for the whole scientific community and global healthcare. Less may indeed
be more.
Acknowledgments
We are grateful for many discussions with colleagues on this topic, as well as the
reviewers’ suggestions.
Conflicts of Interest
SE consults for Collaborative Drug Discovery, Inc. on a Bill and Melinda Gates
Foundation Grant #49852 “Collaborative drug discovery for TB through a novel database
of SAR data optimized to promote data archiving and sharing”. He is also on the advisory
board for ChemSpider. AJW is employed by the Royal Society of Chemistry which owns
ChemSpider and associated technologies.
References
1. A. Bingham and S. Ekins. Competitive Collaboration in the Pharmaceutical and
Biotechnology Industry. Drug Discov Today. In press (2009).
2. A.J. Hunter. The Innovative Medicines Initiative: a pre-competitive initiative to
enhance the biomedical science base of Europe to expedite the development of
new medicines for patients. Drug Discov Today. 13:371-373 (2008).
3. M.R. Barnes, L. Harland, S.M. Foord, M.D. Hall, I. Dix, S. Thomas, B.I.
Williams-Jones, and C.R. Brouwer. Lowering industry firewalls: pre-competitive
informatics initiatives in drug discovery. Nat Rev Drug Discov. 8:701-708 (2009).
4. D.S. Bailey and E.D. Zanders. Drug discovery in the era of Facebook--new tools
for scientific networking. Drug Discov Today. 13:863-868 (2008).
5. D. Tapscott and A.D. Williams. Wikinomics: How Mass Collaboration Changes
Everything. Portfolio, New York, 2006.
6. A.J. Williams. A perspective of publicly accessible/open-access chemistry
databases. Drug Discov Today. 13:495-501 (2008).
7. A.J. Williams. Internet-based tools for communication and collaboration in
chemistry. Drug Discov Today. 13:502-506 (2008).
8. S. Louise-May, B. Bunin, and S. Ekins. Towards integrated web-based tools in
drug discovery. Touch Briefings - Drug Discovery. In press (2009).
9. C. Southan, P. Varkonyi, and S. Muresan. Quantitative assessment of the
expanding complementarity between public and commercial databases of
bioactive compounds. J Cheminformatics. 1:10 (2009).
10. T.I. Oprea, C.G. Bologa, S. Boyer, R.F. Curpan, R.C. Glen, A.L. Hopkins, C.A.
Lipinski, G.R. Marshall, Y.C. Martin, L. Ostopovici-Halip, G. Rishton, O. Ursu,
R.J. Vaz, C. Waller, H. Waldmann, and L.A. Sklar. A crowdsourcing evaluation
of the NIH chemical probes. Nat Chem Biol. 5:441-447 (2009).
11. H. Wang, X. He, M. Band, C. Wilson, and L. Liu. A study of inter-lab and inter-
platform agreement of DNA microarray data. BMC Genomics. 6:71 (2005).
12. G. Deng, A.J. Ashley, W.E. Brown, J.W. Eaton, W.W. Hauck, L.C. Kikwai, M.R.
Liddell, R.G. Manning, J.M. Munoz, P. Nithyanandan, M.J. Glasgow, E. Stippler,
S.Z. Wahab, and R.L. Williams. The USP Performance Verification Test, Part I:
USP Lot P Prednisone Tablets: quality attributes and experimental variables
contributing to dissolution variance. Pharm Res. 25:1100-1109 (2008).
13. A.J. Williams, V. Tkachenko, C. Lipinski, A. Tropsha, and S. Ekins. Free Online
Resources Enabling Crowdsourced Drug Discovery. Drug Discovery World.
Submitted (2009).
14. E.B. van Veen. Obstacles to European research projects with data and tissue:
solutions and further challenges. Eur J Cancer. 44:1438-1450 (2008).
15. S. Ekins. Systems-ADME/Tox: Resources and network approaches. J Pharmacol
Toxicol Methods. 53:38-66 (2006).
16. R. Hoehndorf, J. Bacher, M. Backhaus, S.E. Gregorio, Jr., F. Loebe, K. Prufer, A.
Uciteli, J. Visagie, H. Herre, and J. Kelso. BOWiki: an ontology-based wiki for
annotation of data and integration of knowledge in biology. BMC bioinformatics.
10 Suppl 5:S5 (2009).
17. T.S. Balganesh, P.M. Alzari, and S.T. Cole. Rising standards for tuberculosis drug
development. Trends Pharmacol Sci. 29:576-581 (2008).
18. J.C. Sacchettini, E.J. Rubin, and J.S. Freundlich. Drugs versus bugs: in pursuit of
the persistent predator Mycobacterium tuberculosis. Nat Rev Microbiol. 6:41-52
(2008).
Table 1. Examples of crowdsourcing resources for pharmaceutical research.

Name | Website | Function
myExperiment | http://www.myexperiment.org/ | Workflows, communities
DIYbio | http://diybio.org/ | Community for do-it-yourself biologists
Protocol Online | http://protocol-online.org/ | Biology protocols
OpenWetWare | http://openwetware.org/wiki/Main_Page | Materials, protocols and resources
Open Notebook Science Challenge | http://onschallenge.wikispaces.com/ | Crowdsourced science challenge, initially on solubility measurement
UsefulChem project | http://usefulchem.wikispaces.com/ | An example of one scientist's open notebook
Laboratree | http://laboratree.org/pages/home | Science networking site
Science Commons | http://sciencecommons.org/ | Strategies and tools for faster, more efficient web-enabled scientific research
WikiPathways | http://www.wikipathways.org/index.php/WikiPathways | Curated biological pathways
Open Source Drug Discovery | http://www.osdd.net/home | Collaboration around genomics and computational technologies