Be the first to like this
Internet-based public domain databases containing chemical compounds have grown in number, capability and content in recent years. There are now many databases containing millions of chemical compounds associated with different types of data including chemical names, properties, analytical data, and with associated mapping to proteins, assay data, clinical information and so on. These disparate data sources suffer from one common issue – quality of data. This presentation will provide an overview of our efforts to source the appropriate structural representations for 200 top-selling drugs from public domain sources. This intra- and inter-laboratory comparison of approaches, processes and necessary agreements exposed the challenges associated with aggregating structure-based data. The project also provided data regarding the distribution of quality issues associated with many of the community’s popular databases.