Structure representations in public chemistry databases: The challenges of validating the chemical structures for 200 top-selling drugs


US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure

Internet-based public domain databases containing chemical compounds have grown in number, capability and content in recent years. There are now many databases containing millions of chemical compounds associated with different types of data including chemical names, properties, analytical data, and with associated mapping to proteins, assay data, clinical information and so on. These disparate data sources suffer from one common issue – quality of data. This presentation will provide an overview of our efforts to source the appropriate structural representations for 200 top-selling drugs from public domain sources. This intra- and inter-laboratory comparison of approaches, processes and necessary agreements exposed the challenges associated with aggregating structure-based data. The project also provided data regarding the distribution of quality issues associated with many of the community’s popular databases.
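One way to picture the cross-source comparison described above is a small consistency check on structure identifiers. The sketch below is not the authors' workflow. It assumes each source reports a standard InChIKey for a drug, and it relies on the InChIKey layout: a 14-character first block hashed from the connectivity layer, followed by a block covering stereochemistry and the other layers. The drug names, source names, and key strings are made-up placeholders, and classify_agreement and group_sources are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict


def classify_agreement(keys_by_source):
    """Classify how well several sources agree on one drug's InChIKey.

    keys_by_source maps a source name to the InChIKey it reports.
    Returns 'consistent', 'same_skeleton_mismatch' or 'connectivity_mismatch'.
    """
    if not keys_by_source:
        raise ValueError("no sources given")
    full_keys = set(keys_by_source.values())
    if len(full_keys) == 1:
        return "consistent"
    # The first hyphen-separated block hashes the connectivity layer, so
    # equal first blocks mean the sources agree on the skeleton and differ
    # only in the remaining layers (for example stereochemistry).
    skeletons = {key.split("-")[0] for key in full_keys}
    if len(skeletons) == 1:
        return "same_skeleton_mismatch"
    return "connectivity_mismatch"


def group_sources(keys_by_source):
    """Group source names by the key they report, for a curator to review."""
    groups = defaultdict(list)
    for source, key in keys_by_source.items():
        groups[key].append(source)
    return {key: sorted(sources) for key, sources in groups.items()}


def placeholder_key(skeleton_char, layers_char):
    """Build a well-formed but entirely made-up InChIKey-shaped string."""
    return f"{skeleton_char * 14}-{layers_char * 8}SA-N"


# Placeholder data only: these are not real drugs, sources, or InChIKeys.
records = {
    "drug_a": {
        "source_1": placeholder_key("A", "A"),
        "source_2": placeholder_key("A", "A"),
    },
    "drug_b": {
        "source_1": placeholder_key("B", "A"),
        "source_2": placeholder_key("B", "C"),
    },
    "drug_c": {
        "source_1": placeholder_key("C", "A"),
        "source_2": placeholder_key("D", "A"),
    },
}

report = {name: classify_agreement(keys) for name, keys in records.items()}
for name, verdict in report.items():
    print(name, verdict, group_sources(records[name]))
```

Separating skeleton-level disagreement from differences in the remaining layers is a design choice for triage: the first kind usually means two sources describe different compounds, while the second often points to missing or conflicting stereochemistry that a curator has to resolve by hand.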


Structure representations in public chemistry databases: The challenges of validating the chemical structures for 200 top-selling drugs
Antony Williams, ACS Denver, September 2011

Upfront Acknowledgment - All Authors…

Internet-Based Chemistry

What needs to happen?

Top 200 Drugs on Wikipedia http://en.wikipedia.org/wiki/List_of_bestselling_drugs

The Project Challenge PART ONE

The Project Challenge PART TWO

200 Top-Selling Drugs (2006)

Different Approaches

Observations

Collaboration on Curation

Conclusions

Thank you
Email: williamsa@rsc.org
Twitter: ChemConnector
Blog: www.chemspider.com/blog
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams
