Navigating an Internet of Chemistry via ChemSpider


US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

This is a presentation I gave via the BigBlueButton system to students and faculty at the University of Arkansas, Little Rock, on searching the internet for chemistry.


Navigating an Internet of Chemistry via ChemSpider Antony Williams University of Arkansas, Little Rock, October 2011 UALR Chemistry Seminar Guest Lecture

Overview

Where is chemistry online?

Molfiles

SMILES (http://en.wikipedia.org/wiki/SMILES)

Vendor-dependent SMILES

InChI

Checking for Stereochemistry: Use your drawing package!

Vancomycin Search Molecular SKELETON Search Full Molecule

Searching Chemistry on the Internet

Quality on the Internet

What’s said on the web is true…

Contributing Chemistry to the Web

Submission Process

Conclusion

Thank you. Email: williamsa@rsc.org; Twitter: ChemConnector; Blog: www.chemspider.com/blog; Personal Blog: www.chemconnector.com; Slides: www.slideshare.net/AntonyWilliams

The Royal Society of Chemistry hosts one of the world’s richest collections of online chemistry data that is free-to-access for the community. ChemSpider presently hosts over 30 million unique chemical compounds together with associated data, all accessible via a number of search techniques. With almost 50,000 unique users per day from around the world, the site offers scientists the ability to investigate the world of small molecules via property searches, analytical data and predictive models. The challenges associated with providing a similar platform for “materials” are manifold but, if they could be addressed, such a platform would offer a valuable service to the materials community. This presentation will provide an overview of how ChemSpider was built, our efforts to expand its capabilities into a more encompassing data repository and some of the challenges faced in embracing the diverse world of materials informatics and online data access.

Chemicals, Chemical Identifiers and Navigating Through Databases

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

This is a presentation given to a group of students at the UNC Eshelman School of Pharmacy. As chemists, many of us want to find information that is high quality, accurate and addresses our query. With the increasing proliferation of online chemistry resources it is very common for us to turn to these resources to source data. However, are resources such as Wikipedia, PubChem and the plethora of databases delivering information for metabolism, medicinal chemistry and synthetic chemistry trustworthy? Which of these resources, if any, should be treated as authorities? What is the most integrated approach to sourcing chemistry-related data online? What approaches can be taken to validate the data that is available, and how can individual scientists participate in helping to improve the content and quality of chemistry-related data on the web? Antony Williams is ChemSpiderman. He started the ChemSpider database (www.chemspider.com) as a hobby to deliver a free platform for the community to source chemistry-related data. Within three years the system was acquired by the Royal Society of Chemistry and now serves up close to 25 million chemical structures linked to over 400 data sources across the internet. It offers individual scientists the opportunity to host and share their data with the community and to participate in data curation and annotation. Tony will share his experiences of building this chemistry database with a focus on data validation and curation and sourcing high quality data. During the presentation he will discuss ways to check chemical structure representations before submission to public systems for searching and provide an overview of chemical identifiers, such as SMILES strings and the International Chemical Identifier (InChI), that allow for the interlinking of resources. Attendees can expect to leave the session with a deeper understanding of using the internet to source chemistry-related data.

Online Public Compound Databases

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

Navigating the Complex Web of Chemistry Using ChemSpider

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

The internet has revolutionized the sharing of data and information, and in the domain of chemistry there are many resources available to help with our research. In recent years various online resources have been introduced that allow users to access information, properties and data associated with chemical entities. At a time when CAS has declared that they now have over 50 million unique chemical entities in the registry, the number of chemical structures distributed across the internet also measures in the tens of millions. There are many tens of databases on the internet hosting chemical structures associated with data focused on the specific nature of the collection: metabolic pathways, spectral data collections, chemical vendor collections, biological assay data and crystal structures are examples. Unfortunately there has been no single way to search across all of these resources. ChemSpider has taken on the task of integrating the multiple online resources of information into a single database using the chemical structure as the primary key and retaining the link out and attribution to the original data source. In this manner ChemSpider intends to become a structure-centric hub for the chemistry community. This talk will provide an overview of the ChemSpider platform, how it is being used as a crowdsourcing platform for community-based curation of the data and the future vision of ChemSpider as one of the pillars of the semantic web of chemistry.

Crawling Across the Web of Chemistry Using ChemSpider

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider is a free access website for chemists built with the intention of providing a structure-centric community for chemists. It was developed to index available sources of chemical structures and their associated data into a single searchable repository and make it available to everybody, at no charge. While there are a large number of databases containing chemical compounds and data available online, their inherent quality, accuracy and completeness are severely lacking. ChemSpider has provided a platform so that the chemistry community can contribute to improving the quality of data online and expanding the information to include data such as reaction syntheses, analytical data, experimental properties and linkages to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources. This presentation will provide an overview of ChemSpider and its value to chemists as a search tool and as a public repository of information, and of how it can become one of the primary foundations of internet-based chemistry. I will also discuss the vision for ChemSpider and some of the lofty goals we are setting for the system moving forward.

Why Chemistry and the Web Will Benefit from a ChemSpider

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider is a free access website for chemists built with the vision of providing a structure-centric community for chemists. Vision is great…execution is better. ChemSpider is now one of the internet’s primary portals for chemistry, offering access to over 23 million unique chemical structures from over 200 data sources and expanding daily. Even though there are tens if not hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc., there has been no single way to search across them. Despite the fact that there are a large number of databases containing chemical compounds and data available online, their inherent quality, accuracy and completeness remain lacking in many regards. With ChemSpider we have provided a platform whereby the chemistry community can contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data, experimental properties and links to other valuable resources. This presentation will provide an overview of ChemSpider and its value to chemists as a search tool and as a public repository of information, and of how it can become one of the primary foundations of internet-based chemistry. I will also discuss the vision for ChemSpider and some of the exciting goals we are setting for the system moving forward.

Data integration and building a profile for yourself as an online scientist

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

This document discusses building an online profile as a scientist in the era of big data and open science. It begins with an overview of the speaker's background working in academia, industry, and as an entrepreneur. The speaker then discusses various online tools and platforms that scientists can use to share their work and expertise, such as ORCID, LinkedIn, Google Scholar, SlideShare, and ResearchGate. He emphasizes the importance of making contributions openly available online in order to increase visibility and measure impact through alternative metrics. The speaker also provides examples of using these tools to showcase his own career and publications.

RSC ChemSpider -- Managing and Integrating Chemistry on the Internet to Build...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

The document discusses the challenges of searching for chemistry information online and proposes a vision for improving chemistry searches through a structure-centric approach. It outlines ChemSpider's efforts to integrate chemistry data from various sources on the internet by linking them through chemical structures and InChI identifiers. This allows substructure and skeleton searches that can find more information than text searches. It encourages contributions from students and researchers to add chemical data, curate existing data, and help build ChemSpider into a comprehensive resource for chemistry information on the internet.

ChemSpider - Building a Foundation for the Semantic Web by Hosting a Crowd So...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

There is an increasing availability of free and open access resources for chemists to use on the internet. Coupled with the increasing availability of Open Source software tools, we are in the middle of a revolution in data availability and tools to manipulate these data. ChemSpider is a free access website for chemists built with the intention of providing a structure-centric community for chemists. It was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are tens if not hundreds of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc. and no single way to search across them. Despite the fact that there were a large number of databases containing chemical compounds and data available online, their inherent quality, accuracy and completeness were lacking in many regards. The intention with ChemSpider was to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data, experimental properties and links to other valuable resources. It has grown into a resource containing over 21 million unique chemical structures from over 200 data sources. ChemSpider has enabled real-time curation of the data, association of analytical data with chemical structures, real-time deposition of single or batch chemical structures (including with activity data) and transaction-based predictions of physicochemical data. The social community aspects of the system demonstrate the potential of this approach. Curation of the data continues daily, and thousands of edits and depositions by members of the community have dramatically improved the quality of the data relative to other public resources for chemistry. This presentation will provide an overview of the history of ChemSpider, the present capabilities of the platform and how it can become one of the primary foundations of the semantic web for chemistry. It will also discuss some of the projects underway since the acquisition of ChemSpider by the Royal Society of Chemistry.

ChemSpider – A Community Platform for Chemistry and Resources Supporting the ...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider was developed with the intention of aggregating and indexing available sources of chemical structures and their associated information into a single searchable repository and making it available to everybody, at no charge. There are many tens of chemical structure databases covering literature data, chemical vendor catalogs, molecular properties, environmental data, toxicity data, analytical data etc. and no single way to search across them. Despite the diversity of databases available online, their inherent quality, accuracy and completeness are lacking in many regards. ChemSpider was established to provide a platform whereby the chemistry community could contribute to cleaning up the data, improving the quality of data online and expanding the information available to include data such as reaction syntheses, analytical data and experimental properties. ChemSpider has now grown into a database of over 20 million chemical substances integrated with over 300 disparate data sources, many of these directly supporting the Life Sciences. This presentation will provide an overview of our efforts to improve the quality of data online, to provide a foundation for the semantic web for chemistry and to provide access to a set of online tools and services to support access to these data. I will also discuss how ChemSpider is being used to enhance Semantic Publishing in Chemistry at RSC.

RSC ChemSpider Science Commons Symposium Pacific Northwest #scspn

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider - Does Community Engagement work to Build a Quality Online Resourc...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

The document discusses ChemSpider, a free online chemical database, and its efforts to engage the chemistry community to help build and curate its database. It describes ChemSpider's roles in hosting and exposing chemical data as well as curating submitted data. It acknowledges that while crowdsourcing engagement has been low, more collaboration across databases could help improve overall data quality. Continued growth will depend on better engaging the community to contribute to and help shape the resource.

A Presentation At Nature Publishing Group Crowdsourcing, Collaborations And T...

guest01a117

The document discusses using crowdsourcing and text mining to improve access to chemistry information on the internet. It describes ChemSpider, a search engine for chemical structures, properties, and information. ChemSpider aims to index chemistry articles and literature to make chemical information more accessible and searchable by structure. The challenges of aggregating and curating large amounts of chemical data from various sources are also discussed.

Taming The Wild West Of Internet Based Chemistry You Can Help

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

Connecting Chemists to the Internet Through ChemSpider

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider Presentation At University Of Toronto

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

RSC ChemSpider is the online chemistry database where community contributions...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

The ChemSpider database is a resource hosted by the Royal Society of Chemistry. With over 28 million unique chemicals in the database linked out to over 400 data sources, the platform provides access to experimental and predicted data (properties, spectra etc.), links to publications, patents and a myriad of other resources. The ChemSpider database has been used as the foundation of a number of other resources for chemists including ChemSpider SyntheticPages, the Learn Chemistry Wiki and the Spectral Game. This presentation will provide an overview of ChemSpider and discuss how chemists can both derive value from and contribute to the content available from the database and its related resources. We will also discuss our view of a future platform for managing personal, institutional and public chemistry in a shared environment.

Whitney Symposium Lecture June 2008

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

ChemSpider and How The Wisdom Of The Crowds Can Improve The Quality Of ...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

This is a presentation I gave at the FDA on December 1st 2009 in Washington, DC as part of a symposium involving PubChem, ChemIDplus, PillBox, DailyMed and other related systems. The focus was, as usual, on the quality of data online and how to clean up the information, with a specific focus on the quality of data on the FDA's DailyMed and our efforts to apply semantic markup to the DailyMed articles.

Digital Chemical Representations

NextMove Software

The document discusses several issues related to digital chemical representations and InChI. It provides a brief history of chemical notation systems and discusses some limitations of InChI, including representing tautomers, polymers, and neutral component duplication. The document also notes ongoing work to address stereochemistry issues and support for experimental features like polymer representation. However, many challenges remain for InChI to fully represent the complexity of real-world chemicals.

ChemSpider as a Foundation for Crowdsourcing and Collaborations in Open Chemi...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

This was a presentation I gave to an audience at Nature Publishing Group in New York on May 7th 2009. It's a long presentation and over an hour in length. Not much new here relative to other presentations...just a knitting together of many of the others on here. There is an increasing availability of free and open access resources for scientists to use on the internet. Coupled with an increasing number of Open Source software programs we are in the middle of a revolution in data availability and tools to manipulate these data. ChemSpider is a free access website built with the intention of providing a structure centric community for chemists. As an aggregator of chemistry related information from many sources, at present over 21.5 million unique chemical entities from over 190 separate data sources, ChemSpider has taken on the task of both robotically and manually integrating and curating publicly available data sources. ChemSpider has also provided an environment for users to deposit, curate and annotate chemistry-related information. This has allowed the community to enhance ChemSpider by adding analytical data, associating synthetic pathways and publications and connecting to social networking resources. I will discuss how ChemSpider is fast becoming the premier curated platform and centralized hub for resourcing information about chemical entities and how the platform provides the foundation data for services allowing the analysis of analytical data and collaborative science.

Web Crawling Chemistry

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

- ChemSpider is a free online database of over 21.5 million chemical structures that can be searched by chemical structure, substructure, or text. It aims to make chemistry research openly accessible online.
- ChemSpider indexes chemistry articles and allows them to be searched by chemical structure. It hosts open access articles and is developing tools for crowd-sourced annotation and curation of chemistry literature.
- Key challenges include maintaining data quality at large scale, integrating with authoritative databases, and gaining support from all stakeholders in open data sharing versus closed systems.

RSC ChemSpider – Building An Internet Based Community For Chemists

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

Building A Community Resource For The Life Sciences

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

A Presentation at Nature Publishing Group Crowdsourcing, Collaborations and T...

US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

The document discusses building a structure-centric community for chemists by leveraging crowdsourcing and text-mining of open chemistry data on the internet. It describes ChemSpider's capabilities to search and aggregate chemical data from various sources by structure and property and its efforts to curate and link open access literature and patents to chemical structures. Challenges around data quality and ambiguity in chemical names are also covered. The goal is to enable new ways of searching chemistry information centered around chemical structures.

AZ of Chemspider February 2011

Royal Society of Chemistry

The document summarizes the vision and challenges of ChemSpider, a free online database for chemists. Key points:
- ChemSpider aims to connect chemistry online by allowing searches by chemical structure and linking to related data across the web.
- It was built as a "hobby project" on limited resources but has grown significantly.
- Ensuring data quality is a major challenge due to errors inherited from other databases. Extensive curation is needed.
- Name searching is problematic; structure and substructure searching is preferred.
- Future work includes continued curation, improved search capabilities, and collaborative data cleaning across databases.

