ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are many tens of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data, etc., and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of well over 20 million chemical substances integrated with over 300 disparate data sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for the semantic web for chemistry and to provide access to a set of online tools and services to support access to these data. I will also discuss how ChemSpider is being used to enhance Semantic Publishing in Chemistry at RSC.
RSC|ChemSpider is one of the world’s largest online resources for chemistry-related data and services. Developed with the intention of delivering access to structure-based chemistry data via the internet, the ChemSpider platform hosts over 26 million unique chemical compounds aggregated from over 400 data sources and provides an environment for the community to annotate and curate these existing data as well as deposit new data to the system. The search system delivers flexible querying capabilities together with links to external sites for publication and patent data. ChemSpider has spawned a number of projects, including ChemSpider SyntheticPages for hosting openly peer-reviewed chemical synthesis articles. This presentation will review the present capabilities of the ChemSpider system, providing direct examples of how to use the system to source high-quality data of value to pharmaceutical companies. We will discuss some of the challenges associated with validating data quality, examine how ChemSpider is a part of the semantic web for chemistry and investigate approaches to integrating ChemSpider with analytical instrumentation.
The internet has provided access to unprecedented quantities of data. In the domain of chemistry specifically, over the past decade the web has become populated with tens of millions of chemical structures and related properties of assays, together with tens of thousands of spectra and syntheses. The data have, to a large extent, remained disparate and disconnected. In recent years, with the wave of Web 2.0 participation, any chemist can contribute to both the sharing and validation of chemistry-related data, whether via Wikipedia, the online encyclopedia, or one of the multiple public compound databases. The presentation will offer a perspective on what is available today, our experiences of building a public compound database to link together the internet, and a suggested path forward for enabling even greater integration and connectivity of chemistry data for the masses to both use and participate in developing.
RSC|ChemSpider is one of the world’s largest online resources for chemistry-related data and services. Developed with the intention of delivering access to structure-based chemistry data via the internet, the ChemSpider platform hosts over 26 million unique chemical compounds aggregated from over 400 data sources and provides an environment for the community to annotate and curate these existing data as well as deposit new data to the system. The search system delivers flexible querying capabilities together with links to external sites for publication and patent data. This presentation will review the present capabilities of the ChemSpider system, providing direct examples of how to use the system to source high-quality data of value to chemists. We will discuss some of the challenges associated with validating data quality and examine how ChemSpider is a part of the new “semantic web for chemistry”. ChemSpider has also spawned a number of additional projects, including ChemSpider SyntheticPages for hosting openly peer-reviewed chemical synthesis articles, Learn Chemistry Wiki for students learning chemistry and SpectraSchool for learning spectroscopy.
The internet continues to offer increased access to chemistry data that may be of value to scientists interested in populating systems containing reference toxicology data as well as to provide data for the development of predictive models. This presentation will give an overview of some of the various sources of data available via the internet, provide an overview of some of the challenges associated with gathering high-quality data and discuss methods by which to mesh together disparate data sources.
ChemSpider is a structure-centric database hosted by the Royal Society of Chemistry that links over 25 million chemical compounds to over 400 internet-based resources, including many public domain databases, Wikipedia, chemical vendors, patents, publications and other web-based services. The intention is for ChemSpider to become one of the primary online hubs for chemists to source chemistry-related data. During the development of the ChemSpider database we have utilized numerous approaches to standardizing, curating and validating the data supplied to us for hosting and integration. This presentation will provide an overview of our initial development of the ChemSpider database and of our present processes and procedures for handling incoming data depositions. We will also discuss how crowdsourcing can help to expand, curate and validate the data on the ChemSpider database.
With the intention of providing a high-quality, free internet resource of chemistry-related data for the community, ChemSpider has aggregated almost 25 million compounds linked out to over 400 data sources and provided a platform for the community to both deposit and curate data. This experiment in crowdsourcing for chemistry has now been running for over three years. This presentation will review a number of aspects of the project including (a) the level of community participation in depositing and curating data; (b) the nature of data and content supplied by the community; (c) how ChemSpider is used by the community; (d) using game-based systems to assist in data curation; (e) algorithmic-based approaches to data validation and filtering; and (f) sharing data curation efforts with other online databases.
This is a presentation given in Track 4, Open Access and Cheminformatics, at the Bio-IT Meeting in Boston on April 21st 2010. It is a general overview of ChemSpider activities to link together the internet for chemists and to validate and curate data. We also won the Bio-IT Best Practices Community Service Award that evening.
The patent literature has historically been complex and inaccessible to the searches required for effective IP management and maintenance of a competitive position, particularly when it comes to chemical structure information. The availability of raw patent text feeds in a structured form has allowed the application of text-to-structure and image-to-structure conversion techniques. The problem then became one of applying this solution across massive data sets in an accurate and scalable manner to deliver a turnkey patent informatics system with automatically extracted and searchable chemical structures. SureChem, an advanced cloud application, uses a tournament of methods to achieve higher coverage and accuracy than any single approach. This product was launched and licensed by a user community with a freemium business model. Latterly, user feedback and market shifts indicated a need to link biological data into patents too (sequences, genes, targets, diseases, etc.). This created an opportunity to transition SureChem to EMBL-EBI, a public organisation with the remit of data dissemination and sharing and deep experience of biodata, including the large ChEMBL database of Structure Activity Relationship Data. In 2014 SureChem became SureChEMBL. The presentation will review the development of SureChem, discuss the marketplace for patent informatics, and look ahead to future development plans for SureChEMBL.
The internet now offers access to a myriad of online resources that can be of value to chemists working in the Life Sciences. While finding information online is, in many cases, a simple search away, the accuracy and validity of the associated data and information should be questioned. As more databases and resources are introduced online, commonly without integration with other resources, a scientist must perform multiple searches and then undertake the task of meshing and merging data. ChemSpider is a freely accessible online database that has taken on the challenge of meshing together distributed resources across the internet to provide a structure-based hub. It is a crowdsourcing environment hosting over 26 million unique compounds linked out to over 400 data sources. With well-defined programming interfaces for integration, ChemSpider has been integrated with many commercial and open software packages and is presently serving as the chemistry foundation for the IMI Open PHACTS project.
This was a presentation I gave to an audience at Nature Publishing Group in New York on May 7th 2009. It's a long presentation, over an hour in length. Not much new here relative to other presentations... just a knitting together of many of the others on here.
There is an increasing availability of free and open access resources for scientists to use on the internet. Coupled with an increasing number of Open Source software programs, this places us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free access website built with the intention of providing a structure-centric community for chemists. As an aggregator of chemistry-related information from many sources, at present over 21.5 million unique chemical entities from over 190 separate data sources, ChemSpider has taken on the task of both robotically and manually integrating and curating publicly available data sources. ChemSpider has also provided an environment for users to deposit, curate and annotate chemistry-related information. This has allowed the community to enhance ChemSpider by adding analytical data, associating synthetic pathways and publications and connecting to social networking resources. I will discuss how ChemSpider is fast becoming the premier curated platform and centralized hub for resourcing information about chemical entities and how the platform provides the foundation data for services allowing the analysis of analytical data and collaborative science.
There is an increasing availability of free and open access resources for chemists to use on the internet. Coupled with the increasing availability of Open Source software tools, this places us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free access website built with the intention of providing a structure-centric community for chemists. It was developed to aggregate and index available sources of chemical structures and their associated information into a single searchable repository and to make it available to everybody, at no charge.
There are tens if not hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data, etc., and no single way to search across them. Although a large number of databases containing chemical compounds and data were available online, their inherent quality, accuracy and completeness were lacking in many regards. The intention with ChemSpider was to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties, as well as links to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.
ChemSpider has enabled real-time curation of the data, association of analytical data with chemical structures, real-time deposition of single or batch chemical structures (including with activity data) and transaction-based predictions of physicochemical data. The social community aspects of the system demonstrate the potential of this approach. Curation of the data continues daily, and thousands of edits and depositions by members of the community have dramatically improved the quality of the data relative to other public resources for chemistry.
This presentation will provide an overview of the history of ChemSpider, the present capabilities of the platform and how it can become one of the primary foundations of the semantic web for chemistry. It will also discuss some of the present projects underway since the acquisition of ChemSpider by the Royal Society of Chemistry.
The increasing availability of free and open access resources for scientists on the internet presents us with a revolution in data availability. The Royal Society of Chemistry hosts ChemSpider, a free access website built with the intention of fostering a community for chemists (http://www.chemspider.com/).
ChemSpider is an aggregator of chemistry-related information, at present over 20 million unique chemical entities linked out to over 300 separate data sources, and has taken on the task of both robotically and manually curating publicly available data sources. It is also a public deposition platform where chemists can deposit their own data, including novel structures, analytical data and synthesis procedures, and host data from the growing activities associated with Open Notebook Science.
This presentation will examine chemistry on the internet, the dubious quality of what is available and how the ChemSpider crowdsourced curation platform is fast becoming one of the centralized hubs for resourcing information about chemical entities.
We will also review our efforts to provide free resources for synthesis procedures, spectral data and structure-based searching of the chemistry literature and how chemists can contribute directly to each of these projects.
ICAR 2015
Workshop 10 (TUESDAY, JULY 7, 2015, 4:30-6:00 PM)
The Arabidopsis information portal for users and developers
Agnes Chan (J. Craig Venter Institute)
A Guided Tour of Araport
The Royal Society of Chemistry hosts one of the largest online chemistry databases containing almost 30 million unique chemical structures. The database, ChemSpider, provides the underpinning for a series of eScience projects allowing for the integration of chemical compounds with our archive of scientific publications, the delivery of a reaction database containing millions of reactions as well as a chemical validation and standardization platform developed to help improve the quality of structural representations on the internet. The InChI has been a fundamental part of each of our projects and has been pivotal in our support of international projects such as the Open PHACTS semantic web project integrating chemistry and biology data and the PharmaSea project focused on identifying novel chemical components from the ocean with the intention of identifying new antibiotics. This presentation will provide an overview of the importance of InChI in the development of many of our eScience platforms and how we have used it specifically in the ChemSpider project to provide integration across hundreds of websites and chemistry databases across the web. We will discuss how we are now expanding our efforts to develop a Global Chemistry Network encompassing efforts in Open Source Drug Discovery and the support of data management for neglected diseases.
This is the presentation I gave at OpenSciNY 2010. It was a great gathering of librarians and people interested in Open Science. Sharing the stage with Beth Brown, Jean-Claude Bradley and Heather Joseph was, as usual, a good opportunity to discuss how openness and online data sharing are changing the way we access and share data. We live in interesting and exciting times.
The Royal Society of Chemistry (RSC) is a major participant in providing access to chemistry related data via the web. As an internationally renowned society for the chemical sciences, a scientific publisher and the host of the ChemSpider database for the community, RSC continues to make dramatic strides in providing online access to data. ChemSpider provides access to over 30 million chemicals sourced from over 500 data suppliers and linked out to related information on the web. The platform is a crowdsourcing environment whereby members of the community can participate in validating and expanding the content of the database. With a set of application programming interfaces ChemSpider is used by various organizations and projects to serve up data for various purposes. These include structure identification for mass spectrometry instrument vendors, RSC databases such as the Marinlit natural products database and a European grant-based project from the Innovative Medicines Initiative fund. This presentation will provide an overview of various cheminformatics activities and projects that RSC is involved with to serve the medicinal chemistry community. This will include the Open PHACTS semantic web project, the PharmaSea project to identify new pharmaceutical leads from the ocean and the UK National Compound Collection to identify new lead compounds contained within PhD theses.
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat... (Araport)
The PMR database is a community resource for the deposition and analysis of metabolomics data and related transcriptomics data. PMR currently houses metabolomics data from over 25 species of eukaryotes. In this talk, we introduce PMR's RESTful web APIs for data sharing and demonstrate their application in research, using Araport to provide Arabidopsis metabolomics data.
ICIC 2017: Freeware and public databases: Towards a Wiki Drug Discovery? (Dr. Haxel Consult)
Fernando Huerta (RISE Bioscience & Materials, SE)
Alexander Minidis (Collaborative Drug Discovery - CDD VAULT, Sweden)
How much information do scientists need to design new potential drugs?
A thorough overview of public (open access) scientific information sources and of methods to collect, process, analyse and visualize this information will be presented. A direct application of such freely available information in conjunction with freeware will be described in relation to the efforts of the scientific community to find effective medicines for the Zika virus.
This is a presentation given at the European Bioinformatics Institute (EBI) in Cambridge on December 1st 2010, at an EMBL-EBI Industry Program Workshop on "Chemical Structure Resources". This is where I unveiled details of the intra/inter-validation studies validating drug structures across multiple public domain chemistry databases. I also unveiled early results from the SurveyMonkey study of the "trust" that the community has in public domain chemistry resources.
There is an increasing availability of free and open access resources for scientists to use on the internet. Coupled with the increasing availability of Open Source software tools, this places us in the middle of a revolution in data availability and in the tools to manipulate these data. ChemSpider is a free access website built with the intention of providing a structure-centric community for chemists. As an aggregator of chemistry-related information from many sources, at present over 21.5 million unique chemical entities from over 200 separate data sources, ChemSpider has taken on the task of both robotically and manually curating publicly available data sources. This presentation will provide an overview of the ChemSpider platform and how it is fast becoming the centralized hub for resourcing information about chemical entities.
There is an increasing availability of free and open access resources for chemists to use on the internet. Coupled with the increasing availability of Open Source software tools we are in the middle of a revolution in data availability and tools to manipulate these data. ChemSpider is a free access website for chemists built with the intention of providing a structure centric community for chemists. It was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge.
There are tens if not hundreds of chemical structure databases such as literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc. and no single way to search across them. Despite the fact that there were a large number of databases containing chemical compounds and data available online their inherent quality, accuracy and completeness was lacking in many regards. The intention with ChemSpider was to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data, experimental properties and linking to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.
ChemSpider has enabled real time curation of the data, association of analytical data with chemical structures, real-time deposition of single or batch chemical structures (including with activity data) and transaction-based predictions of physicochemical data. The social community aspects of the system demonstrate the potential of this approach. Curation of the data continues daily and thousands of edits and depositions by members of the community have dramatically improved the quality of the data relative to other public resources for chemistry.
This presentation will provide an overview of the history of ChemSpider, the present capabilities of the platform and how it can become one of the primary foundations of the semantic web for chemistry. It will also discuss some of the present projects underway since the acquisition of ChemSpider by the Royal Society of Chemistry.
The increasing availability of free and open access resources for scientists on the internet presents us with a revolution in data availability. The Royal Society of Chemistry hosts ChemSpider, a free access website for chemists built with the intention of building community for chemists (http://www.chemspider.com/).
ChemSpider is an aggregator of chemistry related information, at present over 20 million unique chemical entities linked out to over 300 separate data sources, and has taken on the task of both robotically and manually curating publicly available data sources. It is also a public deposition platform where chemists can deposit their own data, including novel structures, analytical data and synthesis procedures, and host data associated with the growing activities of Open Notebook Science.
This presentation will examine chemistry on the internet, the dubious quality of what is available and how the ChemSpider crowdsourced curation platform is fast becoming one of the centralized hubs for resourcing information about chemical entities.
We will also review our efforts to provide free resources for synthesis procedures, spectral data and structure-based searching of the chemistry literature and how chemists can contribute directly to each of these projects.
ICAR 2015
Workshop 10 (TUESDAY, JULY 7, 2015, 4:30-6:00 PM)
The Arabidopsis information portal for users and developers
Agnes Chan (J. Craig Venter Institute)
A Guided Tour of Araport
The Royal Society of Chemistry hosts one of the largest online chemistry databases containing almost 30 million unique chemical structures. The database, ChemSpider, provides the underpinning for a series of eScience projects allowing for the integration of chemical compounds with our archive of scientific publications, the delivery of a reaction database containing millions of reactions as well as a chemical validation and standardization platform developed to help improve the quality of structural representations on the internet. The InChI has been a fundamental part of each of our projects and has been pivotal in our support of international projects such as the Open PHACTS semantic web project integrating chemistry and biology data and the PharmaSea project focused on identifying novel chemical components from the ocean with the intention of identifying new antibiotics. This presentation will provide an overview of the importance of InChI in the development of many of our eScience platforms and how we have used it specifically in the ChemSpider project to provide integration across hundreds of websites and chemistry databases across the web. We will discuss how we are now expanding our efforts to develop a Global Chemistry Network encompassing efforts in Open Source Drug Discovery and the support of data management for neglected diseases.
This is the presentation I gave at OpenSciNY 2010. It was a great gathering of librarians and people interested in Open Science. Sharing the stage with Beth Brown, Jean-Claude Bradley and Heather Joseph was, as usual, a good opportunity to discuss how openness and online data sharing are changing the way we access and share data. We live in interesting and exciting times.
The Royal Society of Chemistry (RSC) is a major participant in providing access to chemistry related data via the web. As an internationally renowned society for the chemical sciences, a scientific publisher and the host of the ChemSpider database for the community, RSC continues to make dramatic strides in providing online access to data. ChemSpider provides access to over 30 million chemicals sourced from over 500 data suppliers and linked out to related information on the web. The platform is a crowdsourcing environment whereby members of the community can participate in validating and expanding the content of the database. With a set of application programming interfaces ChemSpider is used by various organizations and projects to serve up data for various purposes. These include structure identification for mass spectrometry instrument vendors, RSC databases such as the Marinlit natural products database and a European grant-based project from the Innovative Medicines Initiative fund. This presentation will provide an overview of various cheminformatics activities and projects that RSC is involved with to serve the medicinal chemistry community. This will include the Open PHACTS semantic web project, the PharmaSea project to identify new pharmaceutical leads from the ocean and the UK National Compound Collection to identify new lead compounds contained within PhD theses.
PMR metabolomics and transcriptomics database and its RESTful web APIs: A dat...Araport
The PMR database is a community resource for deposition and analysis of metabolomics data and related transcriptomics data. PMR currently houses metabolomics data from over 25 species of eukaryotes. In this talk, we introduce PMR's RESTful web APIs for data sharing, and demonstrate their application in research using Araport to provide Arabidopsis metabolomics data.
ICIC 2017: Freeware and public databases: Towards a Wiki Drug Discovery?Dr. Haxel Consult
Fernando Huerta (RISE Bioscience & Materials, SE)
Alexander Minidis (Collaborative Drug Discovery - CDD VAULT, Sweden)
How much information do scientists need to design new potential drugs?
A thorough overview of public scientific information sources (open access) and methods to collect, process, analyse and visualize this information will be presented. A direct application of such freely available information in conjunction with freeware will be described in relation to the efforts of the scientific community to find effective medicines for the Zika virus.
This is a presentation given at the European Bioinformatics Institute (EBI) in Cambridge on December 1st 2010. This was at an EMBL-EBI Industry Program Workshop regarding "Chemical Structure Resources". This is where I unveiled details regarding the intra/inter-validation studies validating drug structures on multiple public domain chemistry databases. I also unveiled early results regarding the SurveyMonkey study of the "trust" that the community has in public domain chemistry resources.
There is an increasing availability of free and open access resources for scientists to use on the internet. Coupled with the increasing availability of Open Source software tools we are in the middle of a revolution in data availability and tools to manipulate these data. ChemSpider is a free access website for chemists built with the intention of providing a structure centric community for chemists. As an aggregator of chemistry related information from many sources, at present over 21.5 million unique chemical entities from over 200 separate data sources, ChemSpider has taken on the task of both robotically and manually curating publicly available data sources. This presentation will provide an overview of the ChemSpider platform and how it is fast becoming the centralized hub for resourcing information about chemical entities.
RSC|ChemSpider is one of the world’s largest online resources for chemistry related data and services. Developed with the intention of delivering access to structure-based chemistry data via the internet, the ChemSpider platform hosts over 26 million unique chemical compounds aggregated from over 400 data sources and provides an environment for the community to both annotate and curate these existing data as well as deposit new data to the system. The search system delivers flexible querying capabilities together with links to external sites for publication and patent data. This presentation will review the present capabilities of the ChemSpider system, providing direct examples of how to use the system to source high quality data of value to chemists. We will discuss some of the challenges associated with validating data quality and examine how ChemSpider is a part of the new “semantic web for chemistry”. ChemSpider has also spawned a number of additional projects, including ChemSpider SyntheticPages for hosting openly peer-reviewed chemical synthesis articles, Learn Chemistry Wiki for students learning chemistry and SpectraSchool for learning spectroscopy.
These are the slides I will be presenting at the Science Commons Symposium Pacific Northwest at the Microsoft Campus here in Redmond in about 5 minutes' time.
The original abstract for the talk is below, BUT the talk changed based on a big interest in InChI and the possibilities for using it in a Semantic Web for Chemistry.
The increasing availability of free and open access resources for scientists on the internet presents us with a revolution in data availability. However, freedom costs and in many cases the cost is quality. ChemSpider is a free access website for chemists built with the intention of providing a structure centric community for chemists. As an aggregator of chemistry related information from many sources, at present over 21.5 million unique chemical entities from over 150 separate data sources, ChemSpider has taken on the task of both robotically and manually curating publicly available data sources. This presentation will provide an overview of how a curated platform can become the centralized hub for resourcing information about chemical entities. We will also present ChemMantis, an entity extraction platform for extracting chemical names and scientific terms in documents and providing a platform for structure-based searching of Open Access chemistry literature.
This is a presentation I gave at the FDA on December 1st 2009 in Washington DC as part of a symposium involving PubChem, ChemIDplus, PillBox, DailyMed and other related systems. The focus was, as usual, on the quality of data online and how to clean up the information, with a specific focus on the quality of data on the FDA's DailyMed and our efforts to apply semantic markup to the DailyMed articles.
The ability to query across a chemistry publisher's content using chemical structure searching can dramatically enhance discoverability. RSC has been applying a number of procedures to integrate RSC’s ChemSpider community resource with our published content and databases. These include: 1) entity extraction procedures, 2) chemical name conversion procedures using software algorithms and curated dictionaries, 3) semantic markup and 4) crowdsourced curation processes. This presentation will provide an overview of the processes we have utilized in order to provide structure-based integration to RSC content. We will discuss our ongoing efforts to extend the approaches to the mining of data from the rich supplementary information sections of many RSC publications. Our intention is to provide access to synthesis procedures and analytical data and further enrich the ChemSpider database for the benefit of the chemistry community.
Scientists commonly find themselves overwhelmed by the amount of information accessible to them. The distribution of resources now includes the entire space of the worldwide web, access to primary databases such as CAS and, commonly, a plethora of internally developed systems. While the web has provided improved access to chemistry-related information, there has not been an online central resource allowing integrated chemical structure-searching of chemistry databases, chemistry articles, patents and web pages such as blogs and wikis. ChemSpider has built a structure centric community for chemists by providing free access to an online database and collaboration tool for chemists. The online database offers an environment for curating the data on ChemSpider as well as the deposition of chemical structures, analytical data and associated information, and provides a significant knowledge base and resource for chemists working in different domains. An overview of present and future capabilities is given.
This is a presentation given at the Opal Events meeting "Drug Discovery Partnerships: Filling the Pipeline". I was speaking in a session with Jean-Claude Bradley regarding "Pre-competitive Collaboration: Sharing Data to Increase Predictability". This presentation discussed some of the work we are doing on Open PHACTS. My thanks especially to Carole Goble, Lee Harland and Sean Ekins for their comments.
ChemSpider is a free access website for chemists built with the intention of providing a structure centric community for chemists. It was developed to index available sources of chemical structures and their associated data into a single searchable repository and make it available to everybody, at no charge. While there are a large number of databases containing chemical compounds and data available online, their inherent quality, accuracy and completeness are severely lacking. ChemSpider has provided a platform so that the chemistry community can contribute to improving the quality of data online and expanding the information to include data such as reaction syntheses, analytical data, experimental properties and linkages to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources.
This presentation will provide an overview of ChemSpider and its value to chemists as a search tool, as a public repository of information and how it can become one of the primary foundations of internet-based chemistry. I will also discuss the vision for ChemSpider and some of the lofty goals we are setting for the system moving forward.
In recent years, in parallel with the general broad trend of information proliferation, many tens of public chemical databases have been created and made available using internet technologies. In many cases fluent data exchange has occurred between these various databases as they source information from one another. While this has the advantage of linking together multiple data sources, the results also include the proliferation of errors across the various databases. The lack of a public authority to resolve such errors significantly affects the quality of freely accessible chemical information. While ChemSpider has previously allowed a crowdsourcing approach to curation, efforts have now migrated to addressing this problem using a "federated resolver" approach. This presentation will report on our work in this area.
The Internet is the world’s publicly accessible container for a myriad of resources containing chemistry related data. Whether it be collections of millions of chemical compounds with their associated properties, interactive displays for analytical data, access to publications and patents or tapping into the increasing availability of online computational engines, the web has become the primary enabling technology to source information and data. Scientists collectively applaud and utilize the availability of such resources, and an increasing proportion of the community is willing to support these resources by contributing both their data and skills to help curate and validate information on the web. This “crowdsourcing” has started to contribute large amounts of data to the commons and serves as a valuable platform for reference and, potentially, discovery.
ChemSpider is one of the chemistry community’s primary online resources and allows scientists to search across 25 million unique chemical compounds linked out to over 400 original data sources and has become a central hub for searching for chemistry-related data. The platform however offers much more to the community and has become a central repository for analytical data, specifically spectra, is a host for community-authored chemical syntheses and facilitates data curation and annotation by any of its users. This presentation will provide an overview of the ChemSpider platform in terms of available data and its efforts to act as a public repository and clearing ground for data curation. We will discuss how such a platform, when coupled with game-based approaches, facilitates both teaching and data validation and will discuss whether public domain resources such as ChemSpider will ultimately become authorities for chemistry.
I am an adjunct prof at the University of North Carolina at Chapel Hill, so when I stopped by yesterday for a business meeting I was informed that I had been lined up to give a talk to the students at 1pm. I had 20 minutes to prepare and assembled a mish-mash of information that might be of value to Citizen Chemists, those who might want to contribute to chemistry on the internet.
Web-based technologies coupled with a drive for improved communication between scientists have resulted in the proliferation of scientific opinion, data and knowledge at an ever-increasing rate. The increasing array of chemistry-related computer-based resources now available provides chemists with a direct path to the discovery of information, once previously accessed via library services and limited to commercial and costly resources. We propose that preclinical absorption, distribution, metabolism, excretion and toxicity data as well as pharmacokinetic properties from studies published in the literature (which use animal or human tissues in vitro or from in vivo studies) are precompetitive in nature and should be freely available on the web. This could be made possible by curating the literature and patents, data donations from pharmaceutical companies and by expanding the currently freely available ChemSpider database of over 21 million molecules with physicochemical properties. This will require linkage to PubMed, PubChem and Wikipedia as well as other frequently used public databases that are currently used, mining the full text publications to extract the pertinent experimental data. These data will need to be extracted using automated and manual methods, cleaned and then published to the ChemSpider or other database such that it will be freely available to the biomedical research and clinical communities. The value of the data being accessible will improve development of drug molecules with good ADME/Tox properties, facilitate computational model building for these properties and enable researchers to not repeat the failures of past drug discovery studies.
Presented by Richard Kidd at "The Future Information Needs of Pharmaceutical & Medicinal Chemistry", Monday 28 November 2011 at The Linnean Society, Burlington Square, London run by the RSC CICAG group.
This is a general presentation about our efforts to build an internet based community for chemists using ChemSpider. A general overview of data quality online, crowdsourced deposition and curation and our progress to deliver a solution to the community for resourcing data.
Essentials of Automations: The Art of Triggers and Actions in FME
ChemSpider – An Online Database and Registration System Linking the Web
1. ChemSpider – An Online Database and Registration System Linking the Web. Antony Williams and Valery Tkachenko. EBI Chemical Registry Systems Workshop, October 2011.