“Bio Space” Chemical Libraries: Perspectives on Rapidly Designing and Identifying Drug Molecules. Suhaib M. Siddiqi
What is PharmaInformatics? The integration of data from chemistry, biology, genomics and proteomics, and computational chemistry (QSAR, QSPR, structure-based drug design, flexible database searching, CombiChem, etc.) to rapidly design and optimize “drug candidates.”
Orphan Receptors, Enzymes, and Proteins as Disease Targets… Validation Issues… A major challenge facing the pharmaceutical industry is to validate, as drug targets, the orphan receptors, enzymes, and other proteins discovered through the Human Genome and proteomics projects, and to identify selective ligands for them as the blockbuster pharmaceuticals of the future.
Current Drug Discovery Trends… More… Cheaper… Faster… “Better”…
How do we accomplish the goal of “More… Cheaper… Faster… and Better Drug Candidates”? By using computational chemistry technology efficiently, integrating R&D data from various departments (e.g., chemistry, biology, pre-clinical, genomics, and proteomics), and by combining combinatorial chemistry, biology, and structure-based drug design to design “Bio Space” combinatorial libraries with “drug-like” features.
High-Value New Chemical Entities: Approximately 40% of drugs in clinical trials are discarded because they do not show the correct absorption, distribution, metabolism, and excretion (ADME) properties [Drug Discovery Today 1997, 2, 436]. The result? A loss of hundreds of millions of dollars.
Scaffolds with Correct ADME… Design scaffolds from the “building blocks” of existing drugs that have the correct absorption, distribution, metabolism, and excretion (ADME) properties. The result? A lower chance of failure in clinical trials due to incorrect ADME.
Design of Libraries with “Drug-Like” Features… Perform a retro-synthetic analysis of the small molecules in the “MDDR” and “CMC” drug databases (~150K drug candidates) and determine which fragments occur repeatedly in drug candidates. Build a database of building blocks that includes many of these fragments that were not previously commercially available.
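The fragment-counting step can be illustrated with a minimal sketch. It assumes each molecule has already been decomposed into named fragments by some retro-synthetic tool (that step is not shown), and the function names, fragment names, and toy data are all hypothetical; none of them come from MDDR or CMC.

```python
from collections import Counter


def recurring_fragments(fragmented_molecules, min_molecules=2):
    """Return fragments that occur in at least `min_molecules` molecules.

    `fragmented_molecules` is a list of fragment lists, one per molecule.
    A fragment is counted once per molecule, however often it repeats there.
    """
    counts = Counter()
    for fragments in fragmented_molecules:
        counts.update(set(fragments))
    return {frag: n for frag, n in counts.items() if n >= min_molecules}


def novel_building_blocks(recurring, commercially_available):
    """Keep recurring fragments that are not already commercially available."""
    return sorted(f for f in recurring if f not in commercially_available)


# Hypothetical toy data standing in for fragmented drug-database entries.
toy_molecules = [
    ["piperazine", "benzamide", "phenyl"],
    ["piperazine", "indole"],
    ["indole", "phenyl", "phenyl"],
    ["morpholine"],
]

recurring = recurring_fragments(toy_molecules)
novel = novel_building_blocks(recurring, commercially_available={"phenyl"})
print(recurring)
print(novel)
```

In practice the recurrence threshold and the definition of "commercially available" would be project decisions; the sketch only shows the counting and filtering logic.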
Use ADME calculations to eliminate non-drug-like building blocks from the library design. Optimize three-dimensional coordinates for the library scaffolds. Assemble an optimal virtual library by eliminating overlap, in both design and chemical space, between compounds based on different core structures.
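The slides do not say which ADME calculation is used. As a stand-in, the sketch below applies Lipinski's rule of five (molecular weight ≤ 500, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) to precomputed descriptors. The descriptor values, the names, and the choice of tolerating one violation are assumptions for illustration, and thresholds suited to building blocks rather than whole molecules would likely be stricter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed properties for one candidate (values are hypothetical)."""
    name: str
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def rule_of_five_violations(d):
    """Count how many of Lipinski's four criteria the candidate breaks."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def drug_like(candidates, max_violations=1):
    """Keep candidates with at most `max_violations` rule-of-five violations."""
    return [d for d in candidates if rule_of_five_violations(d) <= max_violations]


toy_candidates = [
    Descriptors("block_a", 180.2, 1.2, 1, 3),    # breaks no criterion
    Descriptors("block_b", 520.0, 4.1, 2, 6),    # breaks the weight criterion only
    Descriptors("block_c", 640.7, 6.3, 6, 12),   # breaks all four criteria
]

kept = [d.name for d in drug_like(toy_candidates)]
print(kept)
```

A production filter would compute the descriptors from structures with a cheminformatics toolkit and would likely add further predicted ADME properties.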
Library Comparison: “Bio-Space Chemical Library.” Filling the “void” with a “Bio-Space” library will lead to more potential hits.
Technical Advantage… Libraries are designed “medicinal chemistry-wise and pharmacokinetics-wise” to yield meaningful “hits.” Hits found through “Bio-Space” libraries will be inherently more meaningful because all of them will be “drug-like.”
Computational Chemistry Methodologies
Rapid Dual Filtration of “Bio-Space” Chemical Libraries!
Genomics and Proteomics: Integrating genomics or proteomics into a drug discovery program enhances the target selection process. Early access to these data provides a competitive advantage. The informatics system should track data, annotations, and decisions made at this early stage to enable future analysis of the selection process. Links to sequence, structure (if available), and other data should be provided. Storing images related to these data may also be desirable.
Integration of Data and Images: Pharmaceutical/biotech R&D involves the effective integration of a wide variety of data. In addition to the more traditional chemical and biological data (both “HTS” and “secondary”), genomic sequences and annotations, target protein structures, and images derived from proteomics and pre-clinical analysis must be readily available for review to support timely decision making, both by management and by the scientists working directly with the data.
Conventional SAR Approach… Conventional SAR approaches establish relationships between the structure of a compound and its activity. Links to proteomics and tissue information are missing in conventional SAR.
Image Informatics SAR Approach (ISAR): Image informatics provides a new source of information that a researcher can readily use to gain insight into experimental results. It supports retrieval of database images associated with compounds or assay results of interest. Associating images from experiments with structure and activity data will allow researchers to better understand the biological effects of those structures.
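One way to picture the association of images with structures and activity data is a relational join. The sketch below uses an in-memory SQLite database; the table layout, column names, assay name, and file paths are hypothetical and are not taken from any system named in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE compounds (compound_id TEXT PRIMARY KEY, smiles TEXT);
CREATE TABLE assay_results (compound_id TEXT, assay TEXT, ic50_nm REAL);
CREATE TABLE images (compound_id TEXT, assay TEXT, path TEXT);
""")
conn.executemany(
    "INSERT INTO compounds VALUES (?, ?)",
    [("C1", "CCO"), ("C2", "c1ccccc1")],
)
conn.executemany(
    "INSERT INTO assay_results VALUES (?, ?, ?)",
    [("C1", "kinase_x", 35.0), ("C2", "kinase_x", 4200.0)],
)
conn.executemany(
    "INSERT INTO images VALUES (?, ?, ?)",
    [("C1", "kinase_x", "img/c1_tissue.png"), ("C2", "kinase_x", "img/c2_tissue.png")],
)


def images_for_actives(conn, assay, max_ic50_nm):
    """Return (compound, structure, potency, image path) rows for active compounds."""
    rows = conn.execute(
        """
        SELECT c.compound_id, c.smiles, r.ic50_nm, i.path
        FROM compounds c
        JOIN assay_results r ON r.compound_id = c.compound_id
        JOIN images i ON i.compound_id = r.compound_id AND i.assay = r.assay
        WHERE r.assay = ? AND r.ic50_nm <= ?
        ORDER BY r.ic50_nm
        """,
        (assay, max_ic50_nm),
    )
    return rows.fetchall()


actives = images_for_actives(conn, "kinase_x", 100.0)
print(actives)
```

Storing image paths rather than image bytes is only one design option; the point is that a single query returns structure, activity, and the related image together.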
Image-Enhanced SAR Tables: View tissue samples; search for features in tissues; examine effects of structure on expression. Images from SciMagix.
Explore commonalities and differences in protein expression… Extract, analyze, and mine protein image data from 2D electrophoresis gels. Proteomics scientists can query an entire collection of gel experiments to find similar “protein signatures.”
 
Library Design Strategies: (1) Modify “known” organic molecules (too resource intensive). (2) Use chemical “data mining” strategies (e.g., docking) to identify potential “lead” compounds from “available” or “virtual” libraries (not preferable: expensive and time intensive). (3) Use a known protein or antibody structure to design scaffolds and target compounds.
Library Design and Chemo-informatics Integration Issues: target selection (genomics/proteomics analysis); creation or acquisition of diverse and focused libraries with efficient reagent utilization and reaction optimization; “scientist-friendly” integration of robotics; synthesis or acquisition of “drug-like” molecules; analytical and chemical data storage, retrieval, and analysis; biological data analysis; association of genomics, chemical, biological, and modeling data.
Diversity Estimation: Applications of diversity estimation include: selecting a diverse or focused subset of compounds from a “real” or “virtual” library; comparing a proposed library to the corporate library or to commercially available compounds; selecting “nearest neighbors” to an identified “hit” or lead compound; and selecting reagents using a reactant-biased, product-based strategy (Pearlman & Smith, UT Austin).
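Two of these applications, nearest-neighbor selection and diverse-subset selection, can be sketched with Tanimoto similarity over fingerprint bit sets. This is a generic illustration, not the Pearlman and Smith method: the MaxMin picking rule, the function names, and the toy fingerprints are assumptions introduced here.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def nearest_neighbors(query, library, k=2):
    """Return the names of the k library compounds most similar to `query`."""
    ranked = sorted(library, key=lambda name: tanimoto(query, library[name]), reverse=True)
    return ranked[:k]


def maxmin_subset(library, size, seed):
    """Greedy MaxMin pick: repeatedly add the compound least similar to those picked."""
    picked = [seed]
    while len(picked) < min(size, len(library)):
        remaining = [n for n in library if n not in picked]
        # For each candidate, find its highest similarity to any picked compound,
        # then take the candidate for which that value is smallest.
        best = min(
            remaining,
            key=lambda n: max(tanimoto(library[n], library[p]) for p in picked),
        )
        picked.append(best)
    return picked


# Hypothetical fingerprints: each set lists the bit positions that are on.
toy_library = {
    "cpd1": {1, 2, 3, 4},
    "cpd2": {1, 2, 3, 5},
    "cpd3": {7, 8, 9},
    "cpd4": {3, 4, 9, 10},
}
hit = {1, 2, 3, 4}

neighbors = nearest_neighbors(hit, toy_library, k=2)
diverse = maxmin_subset(toy_library, size=3, seed="cpd1")
print(neighbors)
print(diverse)
```

The same similarity function serves both goals: a focused library keeps the closest neighbors of a hit, while a diverse library keeps compounds that are far from everything already chosen.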
Disease Targets through “Bio-Space” Libraries: Diseases for which well-known targets exist (e.g., cancer, CNS, diabetes). Rapidly identify new drug candidates for orphan receptors from genomics and proteomics projects.
Data Mining / HTS Screening
Summary… [Workflow diagram. Stage labels: Library, HTS, Hits, F. Lib., HTS, Leads, Opt., Pre-Clinic., Clinical, Drug. Discipline labels: Comp. Chem., Chem., Biology.]
Conclusions: Successful drug discovery requires robust informatics; effective utilization of genomics and proteomics data; consideration of the diversity and “drug-like” quality of scaffolds; effective application of HT synthesis and HTS; HT-QSAR, QSPR, predictive ADMET, and bioavailability calculations; database mining; and teamwork, cooperation, and information sharing.
