General Concepts in QSAR for Using the QSAR Application Toolbox
Part 1: General Concepts in QSAR
Online Course Outline
• The need for predictive methods
• Basic terminology in QSAR development
• Selecting biological endpoints for modeling
• Using trends to define chemical categories
• Chemical categories for filling data gaps
• Overview of the QSAR Toolbox
Need for Predictive Methods
• Laboratory measurements of chemical toxicity must address many different hazards and responses (biological effects) under many exposure scenarios
• Regulatory risk (or safety) assessments rely heavily on the interpretation of bioassays, which are designed to reveal the spectrum of effects of a chemical
• Most assessments rely on batteries of bioassays intended to characterize important hazards such as short-term effects, carcinogenicity, mutagenicity, reproductive impairment and developmental deficits
• Screening-level assessments can cost from $1M to $5M, while comprehensive risk assessments can cost more than $60M in testing and analysis
Need for Predictive Methods
• Due to the high cost of animal tests, risk assessments based on such tests are limited to a small percentage of industrial chemicals
• Fewer than 10,000 chemicals have been tested for the major hazards; the majority of these chemicals have been tested for only a few hazards
• The world inventory of chemicals in commerce exceeds 160,000 chemicals and is growing by more than 3,000 new chemicals each year
• The collective capacity of all OECD member countries to conduct the SIDS initial hazard assessments was ~500 chemicals/year for the last decade
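The size of the gap can be checked with back-of-the-envelope arithmetic. The sketch below uses only the figures quoted on this slide and assumes they stay constant over time, which the slide does not claim; it also treats the SIDS assessment capacity as if it applied to the whole untested inventory, which is a simplification.

```python
# Figures quoted on the slide; holding them constant is an assumption.
inventory = 160_000        # chemicals in commerce (lower bound)
tested = 10_000            # chemicals tested for the major hazards (upper bound)
capacity_per_year = 500    # approximate OECD SIDS assessment capacity
new_per_year = 3_000       # new chemicals entering commerce each year

# Share of the inventory with test data for the major hazards.
fraction_tested = tested / inventory

# Years needed to work through today's untested chemicals at constant capacity.
years_for_backlog = (inventory - tested) / capacity_per_year

# Yearly growth of the untested backlog once capacity is subtracted.
net_growth = new_per_year - capacity_per_year

print(f"Fraction tested: at most {fraction_tested:.1%}")
print(f"Years to clear the current backlog: {years_for_backlog:.0f}")
print(f"Net yearly growth of the backlog: {net_growth} chemicals")
```

Under these assumptions the backlog grows faster than it can be assessed, which is the argument the slide makes for predictive methods.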
Need for Predictive Methods
• Alternative test methods that are more diagnostic and faster are one leading approach to fill the data gaps
• Non-testing alternative methods involve the use of chemical models to extrapolate the hazards of tested chemicals to similar untested chemicals
• The non-testing alternative method includes the use of quantitative structure-activity relationships (QSARs) that relate biological activity to structure
• Physical chemists, engineers and medicinal chemists have been reliably estimating the behavior of untested chemicals for more than 60 years
Basic Definitions in QSAR
• Chemistry is based on the simple premise that similar chemicals will behave similarly
• If two chemicals appear to be very similar but behave dramatically differently (e.g., stereoisomers), one’s perception of similarity is wrong
• Like most complex systems, the behavior of a chemical as a molecular system is largely derived from the electronic and steric properties of its structure
• Therefore, the field of QSAR research is concerned with methods for quantifying chemical similarity in order to improve ways of grouping similar chemicals
• Similarity is not an absolute, but must be determined within a specific context for a specific attribute or behavior
Basic Definitions in QSAR
• For toxicology, structure-activity relationships start with selecting a test endpoint such as lethality (LC50) or effect concentration (EC50)
• QSAR searches for relationships between chemical structure and activity so that the test endpoint can be predicted accurately from structure
• For example, industrial chemicals are classified as inhalation hazards when the 4-hour LC50 of a chemical for rats is less than 20 mg/L
• When LC50 values are compiled for 20-30 chemicals and chemical structure is represented by the vapor pressure (VP), a QSAR model can be formed
• In this example, the QSAR is log LC50 (rat, 4 hr) = 0.69 log VP + 1.54, which allows the LC50 of untested similar chemicals to be estimated
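Applying the rat QSAR above is a one-line calculation. The following is a minimal sketch, assuming base-10 logarithms and assuming VP is expressed in the same units as the training data; the slide states neither, and the example vapor pressure is a hypothetical value chosen for easy arithmetic.

```python
import math

# Coefficients from the slide: log LC50 (rat, 4 hr) = 0.69 log VP + 1.54
SLOPE = 0.69
INTERCEPT = 1.54


def predict_log_lc50(vp: float) -> float:
    """Return the predicted log10 LC50 for a vapor pressure vp (must be > 0)."""
    if vp <= 0:
        raise ValueError("vapor pressure must be positive")
    return SLOPE * math.log10(vp) + INTERCEPT


def predict_lc50(vp: float) -> float:
    """Return the predicted LC50 on the original (non-log) scale."""
    return 10 ** predict_log_lc50(vp)


example_vp = 100.0  # hypothetical value, units as in the training data
print(f"log LC50 = {predict_log_lc50(example_vp):.2f}")
print(f"LC50     = {predict_lc50(example_vp):.0f}")
```

The estimate is only meaningful for chemicals similar to those used to build the model, as the next slide explains.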
Selecting Biological Endpoints
• QSARs can be used to estimate important toxicity endpoints for thousands of chemical structures in order to focus assessments on the greatest risks
• However, a single QSAR model for a toxicity endpoint like LC50 is only reliable for chemicals that are similar to the training set of chemicals
• In toxicology, similar chemicals are usually defined as those that cause toxic effects through the same toxicity mechanisms
• Therefore, QSAR modeling must first predict whether a chemical acts through the toxicity mechanism for which a particular model was built
• If a chemical’s toxicity mechanism differs from the one for which a particular model was built, it is, by definition, not similar and its effects cannot be estimated reliably with that same QSAR model
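One way to picture this rule is a model that refuses to answer for chemicals outside its mechanism. The sketch below is illustrative only: the class name, the mechanism labels and the coefficients are hypothetical, and it assumes the mechanism of the query chemical has already been assigned by some other means.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MechanismQSAR:
    """A linear QSAR tied to the toxicity mechanism of its training set."""
    mechanism: str    # hypothetical label for the training-set mechanism
    slope: float
    intercept: float

    def estimate_log_lc50(self, mechanism: str, vp: float) -> Optional[float]:
        """Return log10 LC50, or None when the chemical is outside the domain."""
        if mechanism != self.mechanism:
            return None  # different mechanism: not similar, so no estimate
        return self.slope * math.log10(vp) + self.intercept


# Placeholder coefficients, for illustration only.
model = MechanismQSAR(mechanism="mechanism_A", slope=0.7, intercept=1.5)

in_domain = model.estimate_log_lc50("mechanism_A", vp=10.0)
out_of_domain = model.estimate_log_lc50("mechanism_B", vp=10.0)

print(in_domain)      # an estimate, because the mechanisms match
print(out_of_domain)  # None, because the model does not apply
```

Returning no value, rather than an unreliable number, keeps out-of-domain chemicals from being silently assessed with the wrong model.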
Steps to Creating QSAR Models
• Choose a well-defined endpoint for biological activity that is relevant to the assessment
• Compile measured values of the biological endpoint using a consistent test method for similar chemicals (training set)
  OR
  Select a homologous series of relevant chemicals and systematically test all of them for the biological endpoint using a consistent method
• Identify the chemical attributes that are likely to be important in toxicity mechanisms and the endpoint, and then calculate for each chemical the “molecular descriptors” (e.g., VP, log P, pKa) that put those attributes in numerical terms
• Explore the statistical variances among the molecular descriptors and endpoint values, and identify relationships between the molecular descriptors and the endpoint for the assessment
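The last step can be sketched as an ordinary least-squares fit of the endpoint against a single descriptor. The training values below are made up for illustration and are not measurements; a single-descriptor linear fit is also an assumption, since the slide does not prescribe a statistical method.

```python
# Hypothetical training set: (log10 VP, log10 LC50) pairs, not real data.
training_set = [
    (0.5, 1.9),
    (1.0, 2.2),
    (1.5, 2.6),
    (2.0, 2.9),
    (2.5, 3.3),
]


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(points, slope, intercept):
    """Fraction of the endpoint variance explained by the fitted line."""
    mean_y = sum(y for _, y in points) / len(points)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - mean_y) ** 2 for _, y in points)
    return 1.0 - ss_res / ss_tot


slope, intercept = fit_line(training_set)
r2 = r_squared(training_set, slope, intercept)
print(f"log LC50 = {slope:.2f} log VP + {intercept:.2f}  (R^2 = {r2:.3f})")
```

A high R^2 on a handful of points says little by itself; the fit is only as good as the consistency of the test method and the similarity of the chemicals in the training set.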
Simple Example for QSAR
• Compile literature data for lethality (LC50) in mice from 30-minute inhalation exposures
• In this example, restricting chemicals to simple aliphatic ethers increases the likelihood that the toxicity mechanism for lethality will be the same
• As shown on the next slide, estimate or measure the vapor pressure to be used as a molecular descriptor (selected from theory or by trial and error)
• Correlate LC50 values with the vapor pressure to get: log LC50 = 0.57 log VP + 2.08
• This regression equation is the QSAR for this endpoint and this class of chemicals even though the toxicity mechanism is not known
Simple Example for QSAR
• Notice the dependence on VP (slope) is almost the same as the QSAR derived from the 4-hour exposure with rats shown earlier, suggesting the same structural attributes are controlling toxicity
• Notice the intercept is about 0.5 log units greater for the 30-minute test with mice versus the 4-hour test with rats
• If we assume the toxicity mechanism causing lethality is the same for chemicals in both sets, can you explain why the LC50 in mice is greater (lower toxicity) than the LC50 in rats?
• Statistical exploration of data compiled for different chemicals is one of several important methods for defining chemical similarity
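The comparison of the two regressions can be made concrete with a few lines of arithmetic. This sketch uses the coefficients quoted in the slides and assumes both equations use base-10 logarithms and the same units for VP and LC50, which the slides do not state.

```python
# (slope, intercept) pairs quoted in the slides.
RAT_4H = (0.69, 1.54)       # log LC50 (rat, 4 hr)       = 0.69 log VP + 1.54
MOUSE_30MIN = (0.57, 2.08)  # log LC50 (mouse, 30 min)   = 0.57 log VP + 2.08


def log_lc50(model, log_vp):
    """Evaluate a linear QSAR given as (slope, intercept) at a log10 VP."""
    slope, intercept = model
    return slope * log_vp + intercept


# Difference between the intercepts, in log units.
intercept_gap = MOUSE_30MIN[1] - RAT_4H[1]

# At log VP = 0 the gap translates into this ratio of predicted LC50 values.
ratio_at_unit_vp = 10 ** intercept_gap

# Because the slopes differ slightly, the two lines meet at this log VP.
crossover_log_vp = intercept_gap / (RAT_4H[0] - MOUSE_30MIN[0])

print(f"Intercept gap: {intercept_gap:.2f} log units")
print(f"Mouse/rat LC50 ratio at log VP = 0: {ratio_at_unit_vp:.1f}")
print(f"Lines cross at log VP = {crossover_log_vp:.1f}")
```

Whether the crossover point lies inside the range of vapor pressures covered by either training set cannot be determined from the slides.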
Simple Example for QSAR
• This QSAR implies that the vapor pressure of a chemical is an important factor in determining that chemical’s potency in a lethality test
• Many other molecular descriptors would not correlate with toxicity, and the good correlation here points to structural attributes that influence VP
• Chemicals that cause lethality by other toxicity mechanisms, such as acrolein or phosgene, will appear as statistical outliers
• Therefore, in QSAR, “outlier analysis” is often used to gain insight into chemical similarity as defined in terms of common mechanisms
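Outlier analysis can be sketched as comparing each observed value with the QSAR prediction and flagging large residuals. The chemical names, the observed values and the 1.0 log-unit threshold below are all hypothetical; only the regression coefficients come from the mouse example above.

```python
# Mouse 30-minute QSAR from the example: log LC50 = 0.57 log VP + 2.08
SLOPE, INTERCEPT = 0.57, 2.08

# Hypothetical observations: name -> (log10 VP, observed log10 LC50).
observations = {
    "ether_A": (1.0, 2.7),
    "ether_B": (1.5, 3.0),
    "ether_C": (2.0, 3.1),
    "reactive_X": (2.0, 1.0),  # made-up chemical acting by another mechanism
}

THRESHOLD = 1.0  # hypothetical cutoff in log units


def residual(log_vp, observed_log_lc50):
    """Observed minus predicted log LC50; negative means more toxic than predicted."""
    return observed_log_lc50 - (SLOPE * log_vp + INTERCEPT)


residuals = {name: residual(*values) for name, values in observations.items()}
outliers = [name for name, r in residuals.items() if abs(r) > THRESHOLD]

for name, r in residuals.items():
    flag = "OUTLIER" if name in outliers else ""
    print(f"{name:12s} residual = {r:+.2f} {flag}")
```

A flagged chemical is a prompt to ask whether it acts through a different toxicity mechanism, not proof that it does.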
