This document describes a label-free quantitative proteomics method using liquid chromatography coupled to mass spectrometry (LC/MS). The method relies on comparing changes in peptide signal responses and retention times (accurate mass retention time or AMRT components) between control and experimental samples to determine relative protein abundance changes. The method was tested by spiking increasing amounts of standard proteins into human serum samples and observing a linear relationship between signal response and protein concentration. The quantitative proteomics strategy provides a simple LC/MS-based method for comparing protein profiles between samples without using stable isotope labeling.
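The linearity check described above can be sketched as a least-squares fit of signal response against spiked amount. This is a minimal sketch, not the study's own analysis: the function names and all numbers are hypothetical and chosen only for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination for the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical spike-in amounts and summed peptide responses (not from the study).
spiked_amount = [1.0, 2.0, 4.0, 8.0]
signal_response = [105.0, 198.0, 410.0, 795.0]

slope, intercept = fit_line(spiked_amount, signal_response)
print(slope, intercept, r_squared(spiked_amount, signal_response, slope, intercept))
```

An R² close to 1 on such a series would be consistent with the linear relationship the summary reports; it does not by itself validate quantification in a real serum background.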
QSAR studies of some anilinoquinolines for their antitumor activity as EGFR i..., by IOSR Journals
Quantitative structure-activity relationship (QSAR) studies have been performed on some anilinoquinolines. A variety of parameters, including 2D-autocorrelation, RDF, 3D-MoRSE, WHIM and GETAWAY parameters, were chosen for modeling the antitumor activity of these compounds. Multiple regression analysis reveals that the seven-parameter model is the best for modeling the activity of the compounds under study. This model has been tested using cross-validated parameters. The results are also discussed on the basis of ridge regression.
1) The study analyzed epigenetic variation in shoots from a 1000-year old clone of the seagrass Zostera marina in the Baltic Sea.
2) While all 34 shoots sampled along a 250m transect were genetically identical based on microsatellite analysis, they showed epigenetic differences in cytosine methylation patterns.
3) Epigenetic variation between shoots was independent of their distance from shore and not correlated with geographic distance, suggesting epigenetic variation is not spatially structured within this clonal meadow.
This study examined the binding interactions between a natural MUC1 peptide epitope (GVTSAPD) and 6 derivative epitopes with a MUC1 monoclonal antibody (mAb 6A4) using Saturation Transfer Difference (STD) NMR spectroscopy. The results showed that the proline residue at position 6 of the natural epitope is critical for antibody binding. Replacing this proline with aromatic residues such as tryptophan and phenylalanine maintained some binding ability, while substitution with aliphatic or polar residues eliminated binding. These findings help identify residues important for antibody recognition and could inform the design of cancer vaccine candidates.
Nc state lecture v2 Computational Toxicology, by Sean Ekins
The document discusses computational approaches to modeling various aspects of toxicology, including physicochemical properties, quantitative structure-activity relationships, and interactions with proteins and pathways involved in toxicity. It provides examples of modeling properties like solubility and lipophilicity, as well as targets like cytochrome P450 enzymes and the pregnane X receptor. Statistical methodologies for building predictive models are also reviewed. The future of crowdsourced drug discovery is briefly mentioned.
The document describes a study that identified distinct methylation subtypes of Barrett's esophagus (BE) and esophageal adenocarcinoma (EAC) using genome-wide DNA methylation data. Four subtypes were identified with high (HM), intermediate (IM), low (LM), or minimal (MM) levels of methylation. The subtypes showed differences in epigenetically silenced genes and frequencies of genetic alterations. Specifically, the HM subtype was enriched for ERBB2 alterations and ARID1A mutations. The findings suggest etiological and biological differences between the subtypes that could help direct clinical care.
This document describes a study that used quantitative proteomic analysis with liquid chromatography and mass spectrometry to analyze changes in protein expression in mycobacteria in response to isoniazid treatment. Key findings include:
1) Proteins encoded by the kas operon (AcpM, KasA, KasB, Accd6), which are involved in fatty acid synthesis, were significantly overexpressed in response to isoniazid treatment.
2) Proteins involved in iron metabolism and cell division were also overexpressed, suggesting a complex interplay of metabolic events leading to cell death upon isoniazid treatment.
3) The study provides insights into mycobacterial responses to chemotherapeutics
Correlation globes of the exposome 2016, by Chirag Patel
This document discusses developing exposome correlation globes to map associations between exposures and phenotypes. It summarizes work analyzing replicated correlations between over 250 quantitative exposures measured in NHANES participants to create a globe visualization. The analysis found that while the exposome correlations were dense, with around 3% of pair-wise correlations replicated between cohorts, the correlations were modest in absolute size. The exposome globes could help contextualize exposome-wide association studies and identify co-occurring exposures.
1) The document discusses the need for large-scale studies of environmental exposures, known as environment-wide association studies (EWAS), to discover environmental factors associated with disease and address issues with past fragmented studies of single exposures.
2) EWAS can systematically analyze multiple personal exposures simultaneously and adjust for multiple testing to identify strongest associations, which can then be validated in independent data sets.
3) However, establishing causal inferences from observational EWAS data remains challenging due to complex correlations between many environmental factors.
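The multiple-testing adjustment mentioned in point 2 can be illustrated with a false discovery rate procedure. The summary does not say which correction is used, so the Benjamini-Hochberg step-up procedure below is an assumption, and the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True if rejected at FDR level alpha.

    Benjamini-Hochberg step-up: find the largest rank k such that
    p_(k) <= (k / m) * alpha, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, one per exposure tested against a phenotype.
exposure_p_values = [0.001, 0.04, 0.03, 0.5]
print(benjamini_hochberg(exposure_p_values))
```

Exposures flagged this way would still need validation in an independent data set, as the summary notes.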
This document describes using computational methods to identify potential drug candidates that can inhibit breast cancer metastatic beta arrestin 2 (ARRB2). Ensemble-based virtual screening and pharmacophore modeling were used to screen drug molecules from the DrugBank database and identify top candidates. The 15 molecules with best binding were further analyzed with molecular dynamics simulations. The results suggest two molecules as the best ARRB2 inhibitor candidates based on their binding affinity and stability in simulations. The study provides a framework for discovering novel ARRB2 inhibitors using integrated computational approaches.
This thesis investigated using ellipsometry to analyze ligand binding to G-protein coupled receptors (GPCRs). GPCRs are important cell surface proteins linked to many diseases. Ellipsometry is an optical technique that can quantify ligand-receptor interactions by measuring changes at a surface when polarized light is reflected off. An A549 epithelial cell line expressing the CXCR4 GPCR and its ligand CXCL12 was used. Enzyme-linked immunosorbent assays were performed to determine optimal ligand and antibody concentrations. Baseline characterization of glass slides and binding experiments with ligand and antibody were conducted using ellipsometry. Psi-delta analysis of collected data showed trends in binding.
This document summarizes a research article that proposes a new three-parameter generalized beta-Poisson dose-response model for quantitative microbial risk assessment. The model allows for the minimum number of organisms required to cause infection to be a random variable, rather than fixed at one organism as in traditional single-hit beta-Poisson models. The researchers use an approximate Bayesian computation algorithm to estimate parameters for the new model by fitting it to four experimental dose-response data sets from previous studies. The results show that while the new model may better characterize some dose-response processes, it did not significantly improve fit to three of the four data sets, possibly due to small sample sizes. The generalized model provides a way to investigate dose-response mechanisms
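For reference, the conventional single-hit model that the article generalizes is often used in its approximate two-parameter form, P(d) = 1 - (1 + d/β)^(-α). The sketch below implements only that standard approximation; the three-parameter generalized model and its approximate Bayesian computation fit are not reproduced here, and the parameter values are hypothetical.

```python
def approx_beta_poisson(dose, alpha, beta):
    """Approximate single-hit beta-Poisson probability of infection.

    P(d) = 1 - (1 + d / beta) ** (-alpha), with alpha > 0 and beta > 0.
    """
    if dose < 0 or alpha <= 0 or beta <= 0:
        raise ValueError("dose must be >= 0; alpha and beta must be > 0")
    return 1.0 - (1.0 + dose / beta) ** (-alpha)


# Hypothetical parameters, chosen only to show the shape of the curve.
for dose in (0, 10, 100, 1000):
    print(dose, approx_beta_poisson(dose, alpha=0.5, beta=50.0))
```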
This document proposes using ellipsometry to investigate ligand binding to G-protein coupled receptors (GPCRs). GPCRs are integral membrane proteins responsible for many biological processes. Ligand binding to GPCRs is currently measured using cell-based assays and surface plasmon resonance, but both have limitations. Ellipsometry is an optical technique that could quantify ligand-GPCR binding by analyzing changes in the polarization of light reflected from a sample. The proposal aims to use ellipsometry to collect baseline data on substrates and cells, relate ligand binding to changes in ellipsometry measurements, and confirm binding using ELISA. A chemokine receptor and ligand, CXCR4 and CXCL12, will serve as a model system to
Topological analysis of coexpression networks in neoplastic tissues (BITS2012..., by Roberto Anglani
This document analyzes gene co-expression networks in normal and cancer tissues to identify genes that are differentially connected between the two conditions. It introduces a method to characterize disease genes based on statistically significant differences in a gene's degree (number of connections) between normal and cancer networks. Analysis of three cancer datasets finds subsets of differentially connected genes that are distinct from differentially expressed genes and are enriched for genes known to be related to the cancers. Differentially connected genes also alter network properties like average path length more than other gene sets when removed.
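The idea of differential connectivity can be sketched as follows: build one co-expression network per condition by thresholding absolute Pearson correlation, then compare each gene's degree between the two networks. This is a simplified illustration under assumptions: the threshold, function names, and expression values are hypothetical, and the significance testing described in the summary is omitted.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def degrees(expression, threshold):
    """Degree of each gene in the network where |r| >= threshold is an edge."""
    degree = {gene: 0 for gene in expression}
    for g1, g2 in combinations(expression, 2):
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            degree[g1] += 1
            degree[g2] += 1
    return degree


def degree_difference(normal, cancer, threshold=0.8):
    """Cancer degree minus normal degree for every gene present in both."""
    deg_normal = degrees(normal, threshold)
    deg_cancer = degrees(cancer, threshold)
    return {g: deg_cancer[g] - deg_normal[g] for g in deg_normal if g in deg_cancer}


# Hypothetical expression profiles: gene -> values across samples.
normal = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1, -1, 1, -1]}
cancer = {"A": [1, 2, 3, 4], "B": [4, 1, 3, 2], "C": [2, 4, 6, 8]}
print(degree_difference(normal, cancer))
```

In this toy example gene B loses its connection and gene C gains one, even though no statement is made about their expression levels, which mirrors the distinction the summary draws between differentially connected and differentially expressed genes.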
Data as research output, data as part of the scholarly record, by Todd Vision
This document discusses data as a research output and part of the scholarly record. It notes that data repositories like Dryad allow researchers to archive, share, and cite underlying data associated with published research. This helps establish data as a legitimate research product and part of the permanent scholarly record.
Study on In Vitro Kinetic Characterization of Transporter-Mediated Permeability, by Torben Haagh
Permeability studies across cells or tissue are often used to investigate whether permeability is rate-limiting for bioavailability. Transporter-mediated permeability can also be studied in vitro, as exemplified by the use of E1S as a probe. This is the topic of Bente Steffansen and Anne Sophie Grandvuinet's study in chapter 2, “In Vitro Kinetic Characterization of Transporter-Mediated Permeability”, of the book “Transporters in Drug Development”. Download the excerpt here: http://bit.ly/SG_Chapter
This document summarizes recent NMR studies of protein folding and binding conducted in cells and under crowded in vitro conditions. It finds that the crowded intracellular environment can affect protein properties through weak transient interactions, known as quinary interactions, between proteins and other macromolecules. While NMR studies in these complex environments are still limited, observations show that crowding can influence protein folding pathways and dynamics. The effects of crowding are more pronounced for globular proteins, while disordered proteins tend to give higher quality NMR spectra inside cells due to their independent local motions.
CYP121 Drug Discovery (M. tuberculosis), by Anthony Coyne
Fragment screening was used to target cytochrome P450 (CYP) enzymes from Mycobacterium tuberculosis. Thermal shift assays identified fragment hits against CYP121, which were validated by NMR. X-ray crystallography showed two binding modes. Fragment growing, merging, and linking yielded elaborated fragments with improved binding affinity down to 2 nM against CYP121, maintaining good ligand efficiency. Future work will further optimize compounds against CYP121 and screen other CYP enzymes to develop selective, potent inhibitors to treat tuberculosis.
Repurposing large datasets to dissect exposomic (and genomic) contributions i..., by Chirag Patel
This document discusses the need for a new paradigm to discover environmental influences on health and disease through high-throughput exposome-wide association studies (EWAS) in a similar manner to genome-wide association studies (GWAS) for genetic influences. It notes that while GWAS have identified many genetic variants associated with traits, they only explain a small portion of heritability, suggesting an important role for environmental factors. The document advocates for developing methods to robustly characterize exposures through the exposome and relate them to health outcomes at a large scale through EWAS. This would help discover major environmental causes of diseases and help explain the "missing heritability".
Japanese Environmental Children's Study and Data-driven E, by Chirag Patel
The National Health and Nutrition Examination Survey (NHANES) is a major program that has collected data on the health and nutrition of adults and children in the United States since the 1960s. NHANES involves physical exams, medical tests, and interviews with over 10,000 participants every two years to represent the US population. The extensive data collected on environmental exposures, health behaviors, medical conditions, and biomarkers provides a gold standard for understanding the human exposome and discovering environmental factors associated with disease.
Mycobacterium tuberculosis causes a severe disease of the lungs known as tuberculosis. It is a major cause of morbidity and mortality, even in emerging countries. However, developing an antibiotic drug against Mycobacterium tuberculosis is a major challenge
This document summarizes research on engineering bivalent affinity ligands. Key points include:
- A combinatorial bivalent ligand library was generated that identified ligands binding non-overlapping epitopes and maintained thermostability. This library performed better than a monovalent library.
- The best ligands from the bivalent library were sequenced, with 9 found to be bivalent and 1 trivalent. Mutations outside the original library positions were found in some ligands.
- Ligand M2 was identified that bound with low nanomolar affinity and could engage in multi-epitope binding to intrinsically disordered domains of beta-catenin. M2 showed promise as a new binding scaffold.
- Dimer
Biomedical Informatics 706: Precision Medicine with exposures, by Chirag Patel
This document discusses the need for a more comprehensive approach to understanding disease etiology by investigating environmental exposures, or the "exposome", in addition to genetic factors. It notes that genome-wide association studies have been successful in identifying genetic risk factors, but genetics alone explains only a portion of disease risk. Large studies like the National Health and Nutrition Examination Survey collect extensive exposure and health data that could be leveraged to discover environmental risk factors through an "exposome-wide association study" approach analogous to GWAS. Characterizing both genetic and environmental contributions is crucial for advancing precision medicine.
This document summarizes research aimed at developing new inhibitors of the glycogen phosphorylase (GP) enzyme for potential treatment of type 2 diabetes. Computational models were developed to predict the activity of 21 test ligands as GP inhibitors. The models suggested ligands S1, S2, and S21 may be potent nanomolar inhibitors of GP, among the best known. These promising ligands will now be synthesized and experimentally tested for their ability to inhibit GP.
Theriot E.C., Cannone J.J., Gutell R.R., and Alverson A.J. (2009).
The limits of nuclear encoded SSU rDNA for resolving the diatom phylogeny.
European Journal of Phycology, 44(3):277-290.
1) The study analyzed the radiation survival of 533 human cancer cell lines across 26 cancer types using a high-throughput profiling platform. It found significant variation in survival both across and within lineages, on the order of 5- to 7-fold difference within lineages.
2) The profiling platform was validated against standard clonogenic survival assays, showing a high correlation between results. Sensitivity to radiation was found to have a normal distribution within most lineages studied.
3) Analyzing genomic features, the study found that higher levels of somatic copy number alterations (SCNAs) in a tumor's genome correlated with increased survival after radiation exposure, possibly by enabling more error-prone DNA repair mechanisms. Certain gene mutations and
Label-Free Quantitation and Mapping of the ErbB2 Tumor Receptor by Multiple P..., by AB SCIEX India
Label-Free Quantitation and Mapping of the ErbB2 Tumor Receptor by Multiple Protease Digestion with Data-Dependent (MS1) and Data-Independent (MS2) Acquisitions:
Mass spectrometry-based proteomics combined with stable-isotope labeling or tagging is a powerful technique for large-scale quantitation and unbiased characterization of the proteome. Nonetheless, it is well known that unbiased discovery proteomics typically suffers from limited dynamic range and sampling efficiency, which can only be partially addressed by incorporating orthogonal fractionation steps.
The document summarizes information about the apolipoprotein E (APOE) gene, including its location, alleles, and association with various health conditions. It discusses how the APOE gene codes for a protein involved in lipid transport, with different alleles (e2, e3, e4) producing slightly different proteins and phenotypes. Studies have shown the e4 allele increases Alzheimer's risk and earlier onset, while the rarer e2 allele may protect against or delay Alzheimer's. The e4 and e2 alleles also impact cardiovascular disease risk. The document aims to study the distribution of APOE alleles in the population of Hotchkiss School and any differences among demographics.
Genetics of gene expression explores how genetic variation impacts transcriptome diversity and gene regulation. Studying the genetics of gene expression through expression quantitative trait locus (eQTL) mapping can identify genetic variants that influence gene expression levels. eQTL discovery depends on biological factors like cell/tissue type, development, population, and environment, as well as technological factors like sample size and genotyping/gene expression platforms. RNA sequencing has increased the resolution of eQTL mapping by allowing the detection of splicing QTLs, rare eQTLs, and allele-specific expression effects. Understanding the functional consequences of eQTLs and their relationship to disease can provide insights into disease mechanisms and improve disease risk prediction.
This document describes a study comparing data acquired from data-independent LC-MS to data acquired from data-dependent LC-MS/MS. The study analyzed mixtures of four proteins alone and with a complex E. coli protein digest. Each sample was run in triplicate by both acquisition methods. The data-independent LC-MS provided more comprehensive detection of precursor and product ions than the combined data-dependent LC-MS/MS experiments. Over 90% of masses detected by LC-MS/MS were also detected by data-independent LC-MS at the correct retention times with similar fragmentation patterns. The data-independent LC-MS was able to detect more components than the individual data-dependent LC-MS/MS experiments.
This document describes a new method for absolute quantification of proteins using LC-MS/MS. The method is based on the discovery that the average MS signal response for the three most intense tryptic peptides per mole of protein is constant, regardless of protein. Given an internal standard, this relationship can be used to calculate a universal signal response factor to determine absolute protein concentrations without using protein-specific standards. The method was shown to accurately quantify both exogenous proteins in simple mixtures and endogenous serum proteins within 15% error. It also determined stoichiometries of known protein complexes in E. coli lysates.
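The calculation described above can be sketched in a few lines: average the three most intense peptide signals per protein, derive a response factor from an internal standard of known amount, and divide. This is a minimal sketch under assumptions: the function names, protein names, units, and intensities are hypothetical, and real data would need peptide identification and filtering first.

```python
def top3_mean(intensities):
    """Mean of the three most intense peptide signals (fewer if < 3 given)."""
    top = sorted(intensities, reverse=True)[:3]
    if not top:
        raise ValueError("at least one peptide intensity is required")
    return sum(top) / len(top)


def estimate_amounts(peptide_intensities, standard_name, standard_fmol):
    """Estimate the amount (fmol) of every protein from one internal standard.

    The response factor is signal per fmol of the standard; it is assumed
    to apply to all proteins, as the summary describes.
    """
    response_factor = top3_mean(peptide_intensities[standard_name]) / standard_fmol
    return {
        protein: top3_mean(values) / response_factor
        for protein, values in peptide_intensities.items()
    }


# Hypothetical peptide intensities per protein (arbitrary units).
data = {
    "standard": [300.0, 200.0, 100.0, 50.0],
    "protein_x": [90.0, 60.0, 30.0],
}
print(estimate_amounts(data, "standard", standard_fmol=100.0))
```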
This document describes using computational methods to identify potential drug candidates that can inhibit breast cancer metastatic beta arrestin 2 (ARRB2). Ensemble-based virtual screening and pharmacophore modeling were used to screen drug molecules from the DrugBank database and identify top candidates. The 15 molecules with best binding were further analyzed with molecular dynamics simulations. The results suggest two molecules as the best ARRB2 inhibitor candidates based on their binding affinity and stability in simulations. The study provides a framework for discovering novel ARRB2 inhibitors using integrated computational approaches.
This thesis investigated using ellipsometry to analyze ligand binding to G-protein coupled receptors (GPCRs). GPCRs are important cell surface proteins linked to many diseases. Ellipsometry is an optical technique that can quantify ligand-receptor interactions by measuring changes at a surface when polarized light is reflected off. An A549 epithelial cell line expressing the CXCR4 GPCR and its ligand CXCL12 was used. Enzyme-linked immunosorbent assays were performed to determine optimal ligand and antibody concentrations. Baseline characterization of glass slides and binding experiments with ligand and antibody were conducted using ellipsometry. Psi-delta analysis of collected data showed trends in binding.
This document summarizes a research article that proposes a new three-parameter generalized beta-Poisson dose-response model for quantitative microbial risk assessment. The model allows for the minimum number of organisms required to cause infection to be a random variable, rather than fixed at one organism as in traditional single-hit beta-Poisson models. The researchers use an approximate Bayesian computation algorithm to estimate parameters for the new model by fitting it to four experimental dose-response data sets from previous studies. The results show that while the new model may better characterize some dose-response processes, it did not significantly improve fit to three of the four data sets, possibly due to small sample sizes. The generalized model provides a way to investigate dose-response mechanisms
This document proposes using ellipsometry to investigate ligand binding to G-protein coupled receptors (GPCRs). GPCRs are integral membrane proteins responsible for many biological processes. Ligand binding to GPCRs is currently measured using cell-based assays and surface plasmon resonance, but both have limitations. Ellipsometry is an optical technique that could quantify ligand-GPCR binding by analyzing changes in the polarization of light reflected from a sample. The proposal aims to use ellipsometry to collect baseline data on substrates and cells, relate ligand binding to changes in ellipsometry measurements, and confirm binding using ELISA. A chemokine receptor and ligand, CXCR4 and CXCL12, will serve as a model system to
Topological analysis of coexpression networks in neoplastic tissues (BITS2012...Roberto Anglani
This document analyzes gene co-expression networks in normal and cancer tissues to identify genes that are differentially connected between the two conditions. It introduces a method to characterize disease genes based on statistically significant differences in a gene's degree (number of connections) between normal and cancer networks. Analysis of three cancer datasets finds subsets of differentially connected genes that are distinct from differentially expressed genes and are enriched for genes known to be related to the cancers. Differentially connected genes also alter network properties like average path length more than other gene sets when removed.
Data as research output, data as part of the scholarly recordTodd Vision
This document discusses data as a research output and part of the scholarly record. It notes that data repositories like Dryad allow researchers to archive, share, and cite underlying data associated with published research. This helps establish data as a legitimate research product and part of the permanent scholarly record.
Study on In Vitro Kinetic Characterization of Transporter-Mediated PermeabilityTorben Haagh
Permeability studies across cells or tissue are often applied to investigate for permeability being rate limiting in bioavailability. Yet transporter-mediated permeability may be studied in vitro exemplified by using E1S as probe. This is the topic of Bente Steffansen and Anne Sophie Grandvuinet exciting study in chapter 2 “In Vitro Kinetic Characterization of Transporter-Mediated Permeability” of the book “Transporters in Drug Development”. Download the excerpt here: http://bit.ly/SG_Chapter
This document summarizes recent NMR studies of protein folding and binding conducted in cells and under crowded in vitro conditions. It finds that the crowded intracellular environment can affect protein properties through weak transient interactions, known as quinary interactions, between proteins and other macromolecules. While NMR studies in these complex environments are still limited, observations show that crowding can influence protein folding pathways and dynamics. The effects of crowding are more pronounced for globular proteins, while disordered proteins tend to give higher quality NMR spectra inside cells due to their independent local motions.
CYP121 Drug Discovery (M. tuberculosis)Anthony Coyne
Fragment screening was used to target cytochrome P450 (CYP) enzymes from Mycobacterium tuberculosis. Thermal shift assays identified fragment hits against CYP121, which were validated by NMR. X-ray crystallography showed two binding modes. Fragment growing, merging, and linking yielded elaborated fragments with improved binding affinity down to 2 nM against CYP121, maintaining good ligand efficiency. Future work will further optimize compounds against CYP121 and screen other CYP enzymes to develop selective, potent inhibitors to treat tuberculosis.
Repurposing large datasets to dissect exposomic (and genomic) contributions i...Chirag Patel
This document discusses the need for a new paradigm to discover environmental influences on health and disease through high-throughput exposome-wide association studies (EWAS) in a similar manner to genome-wide association studies (GWAS) for genetic influences. It notes that while GWAS have identified many genetic variants associated with traits, they only explain a small portion of heritability, suggesting an important role for environmental factors. The document advocates for developing methods to robustly characterize exposures through the exposome and relate them to health outcomes at a large scale through EWAS. This would help discover major environmental causes of diseases and help explain the "missing heritability".
Japanese Environmental Children's Study and Data-driven EChirag Patel
The National Health and Nutrition Examination Survey (NHANES) is a major program that has collected data on the health and nutrition of adults and children in the United States since the 1960s. NHANES involves physical exams, medical tests, and interviews with over 10,000 participants every two years to represent the US population. The extensive data collected on environmental exposures, health behaviors, medical conditions, and biomarkers provides a gold standard for understanding the human exposome and discovering environmental factors associated with disease.
Mycobacterium tuberculosis causes tuberculosis, a severe disease of the lungs. It is a major cause of morbidity and mortality, even in emerging countries. However, preparing an antibiotic drug against Mycobacterium tuberculosis is a major challenge.
This document summarizes research on engineering bivalent affinity ligands. Key points include:
- A combinatorial bivalent ligand library was generated that identified ligands binding non-overlapping epitopes and maintained thermostability. This library performed better than a monovalent library.
- The best ligands from the bivalent library were sequenced, with 9 found to be bivalent and 1 trivalent. Mutations outside the original library positions were found in some ligands.
- Ligand M2 was identified that bound with low nanomolar affinity and could engage in multi-epitope binding to intrinsically disordered domains of beta-catenin. M2 showed promise as a new binding scaffold.
- Dimer
Biomedical Informatics 706: Precision Medicine with exposuresChirag Patel
This document discusses the need for a more comprehensive approach to understanding disease etiology by investigating environmental exposures, or the "exposome", in addition to genetic factors. It notes that genome-wide association studies have been successful in identifying genetic risk factors, but genetics alone explains only a portion of disease risk. Large studies like the National Health and Nutrition Examination Survey collect extensive exposure and health data that could be leveraged to discover environmental risk factors through an "exposome-wide association study" approach analogous to GWAS. Characterizing both genetic and environmental contributions is crucial for advancing precision medicine.
This document summarizes research aimed at developing new inhibitors of the glycogen phosphorylase (GP) enzyme for potential treatment of type 2 diabetes. Computational models were developed to predict the activity of 21 test ligands as GP inhibitors. The models suggested ligands S1, S2, and S21 may be potent nanomolar inhibitors of GP, among the best known. These promising ligands will now be synthesized and experimentally tested for their ability to inhibit GP.
Theriot E.C., Cannone J.J., Gutell R.R., and Alverson A.J. (2009).
The limits of nuclear encoded SSU rDNA for resolving the diatom phylogeny.
European Journal of Phycology, 44(3):277-290.
1) The study analyzed the radiation survival of 533 human cancer cell lines across 26 cancer types using a high-throughput profiling platform. It found significant variation in survival both across and within lineages, on the order of 5- to 7-fold difference within lineages.
2) The profiling platform was validated against standard clonogenic survival assays, showing a high correlation between results. Sensitivity to radiation was found to have a normal distribution within most lineages studied.
3) Analyzing genomic features, the study found that higher levels of somatic copy number alterations (SCNAs) in a tumor's genome correlated with increased survival after radiation exposure, possibly by enabling more error-prone DNA repair mechanisms. Certain gene mutations and
Label-Free Quantitation and Mapping of the ErbB2 Tumor Receptor by Multiple P...AB SCIEX India
Label-Free Quantitation and Mapping of the ErbB2 Tumor Receptor by Multiple Protease Digestion with Data-Dependent (MS1) and Data-Independent (MS2) Acquisitions:
Mass spectrometry-based proteomics combined with stable-isotope labeling or tagging is a powerful technique for large-scale quantitation and unbiased characterization of the proteome. Nonetheless, it is well known that unbiased discovery proteomics typically suffers from limited dynamic range and sampling efficiency, which can only be partially addressed by incorporating orthogonal fractionation steps.
The document summarizes information about the apolipoprotein E (APOE) gene, including its location, alleles, and association with various health conditions. It discusses how the APOE gene codes for a protein involved in lipid transport, with different alleles (e2, e3, e4) producing slightly different proteins and phenotypes. Studies have shown the e4 allele increases Alzheimer's risk and earlier onset, while the rarer e2 allele may protect against or delay Alzheimer's. The e4 and e2 alleles also impact cardiovascular disease risk. The document aims to study the distribution of APOE alleles in the population of Hotchkiss School and any differences among demographics.
Genetics of gene expression explores how genetic variation impacts transcriptome diversity and gene regulation. Studying the genetics of gene expression through expression quantitative trait locus (eQTL) mapping can identify genetic variants that influence gene expression levels. eQTL discovery depends on biological factors like cell/tissue type, development, population, and environment, as well as technological factors like sample size and genotyping/gene expression platforms. RNA sequencing has increased the resolution of eQTL mapping by allowing the detection of splicing QTLs, rare eQTLs, and allele-specific expression effects. Understanding the functional consequences of eQTLs and their relationship to disease can provide insights into disease mechanisms and improve disease risk prediction.
This document describes a study comparing data acquired from data-independent LC-MS to data acquired from data-dependent LC-MS/MS. The study analyzed mixtures of four proteins alone and with a complex E. coli protein digest. Each sample was run in triplicate by both acquisition methods. The data-independent LC-MS provided more comprehensive detection of precursor and product ions than the combined data-dependent LC-MS/MS experiments. Over 90% of masses detected by LC-MS/MS were also detected by data-independent LC-MS at the correct retention times with similar fragmentation patterns. The data-independent LC-MS was able to detect more components than the individual data-dependent LC-MS/MS experiments.
This document describes a new method for absolute quantification of proteins using LC-MS/MS. The method is based on the discovery that the average MS signal response for the three most intense tryptic peptides per mole of protein is constant, regardless of protein. Given an internal standard, this relationship can be used to calculate a universal signal response factor to determine absolute protein concentrations without using protein-specific standards. The method was shown to accurately quantify both exogenous proteins in simple mixtures and endogenous serum proteins within 15% error. It also determined stoichiometries of known protein complexes in E. coli lysates.
This document summarizes a study that used label-free LC-MS/MS to simultaneously identify and quantify proteins in E. coli grown in different carbon sources (glucose, lactose, acetate). The methodology involved growing E. coli in different media, extracting and digesting proteins, and performing LC-MS/MS analysis using alternating low and elevated collision energies to obtain precursor and fragment ion data. Over 7,000 accurate mass measurements of precursors and 25,000 of fragments were obtained. Multiple peptides were identified for relative protein quantitation. Results showed differential expression of proteins involved in carbon utilization pathways between the growth conditions. The label-free quantitation was consistent with other omics methods.
Demonstration of alternate scanning LCMS for simultaneous acquisition of precursor and product ions without precursor mass selection (i.e., multiplex LCMS).
NIH's Glen Hortin discusses translating new biomarkers into clinical laboratory tests. As a proteomics pioneer, Hortin has worked on protein processing and post-translational modifications for over 20 years. He advises focusing biomarker research on clinically relevant problems and validating candidates rigorously before developing diagnostic tests. Hortin also stresses the need to standardize assays and educate clinicians on appropriate test usage.
This document summarizes a study that used label-free LC-MS to analyze changes in the proteome of Escherichia coli (E. coli) when grown in different carbon sources (glucose, lactose, acetate). E. coli is a commonly used model organism and understanding its response to environmental changes provides insights into microbial physiology. The LC-MS approach simultaneously identified and quantified proteins across conditions without isotopic labeling. Relative protein abundances ranged from 0.1- to 90-fold changes between conditions. The results correlated well with known E. coli biochemistry and previous transcriptional profiling studies. This label-free LC-MS method provides an effective way to characterize microbial proteomes.
1) Researchers used a new LC/MS technique called LCMSE to analyze the proteomes of Mycobacterium bovis bacteria that were either untreated or exposed to the tuberculosis drug isoniazid (INH).
2) They identified over 6,600 peptides and 103 proteins, with 24 proteins showing differing levels between the untreated and INH-exposed bacteria.
3) This provides new insights into how INH works and the bacterial response, and could help identify new drug targets through studying changes in the proteome over time or between drug-resistant and non-resistant strains.
This document describes a novel database search algorithm for identifying proteins from data independent acquisitions where multiple precursor ions are fragmented simultaneously. The algorithm uses an iterative process to incrementally increase selectivity, specificity, and sensitivity. It accounts for peptide retention time, ion intensities, charge states, and accurate masses of precursors and products. The algorithm was tested on simple and complex protein mixtures and validated independently, demonstrating its ability to correctly identify proteins across a wide dynamic range with high sensitivity and specificity.
An Actionable Annotation Scoring Framework For Gas Chromatography - High Reso...Nicole Heredia
1. The document proposes an actionable annotation scoring framework for gas chromatography-high resolution mass spectrometry (GC-HRMS) to standardize the reporting of confidence levels in chemical identifications.
2. The framework adapts an existing scoring schema for liquid chromatography-mass spectrometry to the evidence provided by common GC-HRMS workflows, including retention time, ionization patterns, accurate mass, isotopic patterns, and database matches.
3. Validation using spiked standards in plasma and air samples showed a 12% false positive rate for annotations assigned a confidence level of 2 when isomers are excluded, demonstrating the framework's ability to reliably communicate identification confidence.
Mel Reichman on Pool Shark’s Cues for More Efficient Drug DiscoveryJean-Claude Bradley
Mel Reichman, senior investigator and director of the LIMR Chemical Genomics Center at the Lankenau Institute for Medical Research presents at the chemistry department at Drexel University on November 12, 2009.
Modern drug discovery by high-throughput screening (HTS) begins with testing hundreds of thousands of compounds in biological assays. The confirmed hit rate for typical HTS is less than 0.5%; therefore, 99.5% of the costs of HTS are for generating null data. Orthogonal convolution of compound libraries (OCL) is 500% more efficient than present HTS practice. The OCL method combines 10 compounds per well. An advantage of this method is that each compound is represented twice in two separately arrayed pools. The potential for the approach to better enable academic centers of excellence to validate medicinally relevant biological targets is discussed.
Finland Helsinki Drug Research slides 2011Sean Ekins
This document summarizes the application and future of ADME/Tox (Absorption, Distribution, Metabolism, Excretion and Toxicology) models. It discusses how combining in silico, in vitro and in vivo data can help evaluate these properties earlier in drug discovery. It also outlines how crowdsourcing and increased data and model sharing can help advance the field. Finally, it provides examples of Bayesian machine learning models that have been developed to predict various ADME/Tox endpoints.
This document describes a study that evaluated the performance of quantitative spectral analysis tools used in metabolic profiling when applied to mixtures of biofluid samples. Three urine samples were mixed in known proportions according to an experimental design and analyzed by 1H NMR spectroscopy. Fifty-four metabolites were then quantified from the spectra using two common methods: targeted spectral fitting and targeted spectral integration. Multivariate analysis showed the mixture design was accurately recapitulated from the spectral data. A metric was calculated to assess the reliability of each metabolite measurement across the varying sample compositions. Several metabolites were found to have low reliability, largely due to spectral overlap or low signal-to-noise ratios. This strategy allows evaluation of spectral features in conditions that better represent real biological samples and
1) The document describes the design and synthesis of new (bis)ureidopropyl and (bis)thioureidopropyl diamine compounds as inhibitors of the histone demethylase LSD1.
2) Key compounds featured 3-5-3 and 3-6-3 carbon backbone architectures. Several compounds displayed single-digit micromolar IC50 values against recombinant LSD1 in vitro.
3) Compound 6d showed low micromolar cell viability IC50 values against lung and breast cancer cell lines. It also increased mRNA expression of silenced tumor suppressor genes in lung cancer cells.
Radiolytic Modification of Basic Amino Acid Residues in Peptides : Probes for...Keiji Takamoto
This document discusses using hydroxyl radical mediated protein footprinting coupled with mass spectrometry to map protein structure and examine protein-protein interactions. It specifically examines the radiolytic oxidation of histidine, lysine, and arginine residues in model peptides. Arginine was found to be very sensitive to radiolytic oxidation, producing a characteristic product. Histidine generated a mixture of oxidation products involving rupture and addition to its imidazole ring. Lysine was converted to hydroxylysine or carbonylysine. Examining the reactivity of these basic amino acids expands the utility of protein footprinting techniques.
This study examined whether physiological differences between males and females could influence responses to medical treatments using data from 40 patients. The researchers used multivariate regression, partial least squares analysis, principal component analysis, and permutation procedures to analyze 54 predictor variables, 96 response variables, and missing data from the patients. The results showed that while sex showed some influence on certain pre-treatment physiological variables, it did not significantly impact post-treatment responses or confound results. The researchers concluded that physiological differences between males and females did not appear to confound interpretations from the study.
High-throughput proteomics: from understanding data to predicting themMaté Ongenaert
High-throughput proteomics: from understanding data to predicting themprof. dr. Lennart Martens
UGent - Department of Biochemistry, Faculty of Medicine and Health Sciences, VIB - Group Leader Computational Omics and Systems Biology Group (CompOmics), Department of Medical Protein Research
In proteomics, as in any high-throughput omics field, the rate of data generation has increased dramatically, yielding very large datasets that require substantial processing to render them useful and interpretable. Key concepts here are data management, data-bound analysis algorithms, and user interface design. But we do not need to limit ourselves to only the interpretation of experimental results. By combining data from across many (unrelated) experiments, we can gain substantial knowledge about the strengths and limitations of our technological approaches. High-throughput methods, however, rarely serve as the endpoint for research. As exquisite parallel hypothesis testers, these approaches can quickly highlight promising follow-up targets for more detailed study. Yet moving from discovery to targeted analysis requires much more in-depth understanding of sample and methodology, which is where the insights gained from large-scale data analysis come into play. Armed with this knowledge, we can begin to predict experimental outcomes based on specific hypotheses, thus effectively creating tests or assays that can be used in focused validation experiments.
This document describes the development of gene expression signatures that can predict sensitivity to various chemotherapeutic drugs using microarray data from cancer cell lines.
1) Signatures were developed for several drugs including docetaxel, topotecan, adriamycin, etoposide, 5-fluorouracil, paclitaxel, and cyclophosphamide that could accurately predict drug sensitivity in independent cancer cell line datasets.
2) These signatures were also shown to predict clinical response to the drugs in human patients, including predicting response to docetaxel in breast cancer and ovarian cancer with over 85% accuracy.
3) The signatures were specific to each individual drug and could predict response to multid
This document summarizes research on Peptide Nucleic Acids (PNAs), which are analogs of oligodeoxynucleotides (ODNs) that can bind DNA and RNA in a sequence-specific manner. The document discusses how homopyrimidine PNAs form 2:1 stoichiometric complexes with complementary polypurine targets, whereas PNAs with mixed bases form 1:1 complexes similar to DNA. These PNA-nucleic acid complexes can inhibit gene expression by blocking enzymes involved in transcription, cDNA synthesis, and translation. The document also discusses the impact of biophysical parameters on the biological effects of PNAs as antisense inhibitors of gene expression.
This document summarizes a study examining the absorption, bioavailability, and metabolism of oral and intravenous resveratrol in human volunteers. The key findings were:
- Absorption of a 25 mg oral dose was at least 70%, with peak plasma levels of resveratrol and metabolites of 491 ng/ml reached after 1 hour. However, only trace amounts of unchanged resveratrol (<5 ng/ml) were detected in plasma.
- Most of the oral dose was recovered in urine. Metabolism via sulfate conjugation, glucuronic acid conjugation, and hydrogenation of the aliphatic double bond was identified.
- Sulfate conjugation in the intestine/
Genomic gene expression changes resulting from Trypanosomiasis: a horizontal study Examining expression changes elucidated by micro arrays in seminal tissues associated with the pathophysiology of Trypanosomiasis during disease progression
Talk at Yale University April 26th 2011: Applying Computational Modelsfor To...Sean Ekins
The document discusses applying computational models to problems in toxicology, drug discovery, and beyond. It summarizes recent work using machine learning models and other in silico techniques to predict drug-induced liver injury (DILI) and interactions with transporters like hOCTN2. Models were able to classify compounds as DILI-positive or negative with over 75% accuracy when tested on external datasets. The techniques discussed could help prioritize compounds for further testing and filter libraries to avoid reactive or toxic features.
Chemical Nose Biosensors Cancer Cells and BiomarkersOscar1Miranda2
This document describes a method using gold nanoparticle-polymer sensor arrays to detect and differentiate between normal, cancerous, and metastatic cells. The method works by measuring changes in fluorescence as polymers bound to gold nanoparticles are displaced by interactions with different cell surfaces. Testing showed the sensor arrays could rapidly distinguish (within minutes) different human cancer cell types, as well as normal versus cancerous and metastatic human breast cells. The arrays could also differentiate between isogenic normal, cancerous, and metastatic mouse epithelial cell lines. This method provides a label-free way to detect subtle phenotypic differences between healthy and diseased cell types based on their surface properties.
The SIRM Core aims to:
1. Develop high-throughput profiling of stable isotope labeling patterns in metabolites using mass spectrometry and NMR.
2. Build atom-resolved human metabolic network models and non-steady state flux modeling capabilities.
3. Establish mechanisms of metabolic changes and biomarker regulation associated with drug responses in cardiovascular and neuropsychiatric diseases.
The Core will apply stable isotope tracers and isotopomer profiling to network projects studying gene polymorphisms and drug responses. Metabolic flux modeling and databases will help interpret data and generate testable hypotheses.
2. missed cleavage between 700 and 2481 molecular mass. An average of 7 tryptic peptides of the 105 000 are found within a mass tolerance of 5 ppm of itself. If the mass tolerance is increased to within 1 Da, the average number of tryptic peptides is increased to 165. Using this logic, the opportunity to have more than one peptide eluting within a nominal mass bin can be up to 23 times more likely if the data are reduced from accurate mass measurements to nominal mass. As a result, nominal mass binning of mass spectrometric, LC/MS data may lead to problems in subsequent clustering of replicate analyses and to variability in the corresponding quantitative analysis. Radulovic and co-workers report that their quantitative results exhibited an acceptable measure of variance of 2-fold or less deviation in the observed signal intensities. In addition to presenting data from an identical instrument platform, Wang and colleagues also illustrated LC/MS data collected on a time-of-flight mass spectrometer. In this work, the authors indicated that the higher resolution and mass accuracy of the TOF system was found to be advantageous for tracking and quantifying large numbers of mass spectral peaks. The results obtained from these studies provided acceptable coefficients of variation (∼25%) across integrated peak intensities. The data acquisition platform used by Radulovic was configured to collect two parallel LC/MS experiments in a single LC/MS run for simultaneous quantitative and qualitative analysis. In an alternating fashion, the instrument measures the masses of eluting peptide components in MS mode in one function and then carries out a data-dependent CID for a subset of detected precursor masses in MS/MS mode in a second function. However, the authors affirm that considerably more peptide peaks are detectable in full-scan MS mode than can be identified in the same time frame using the collision-induced dissociation process. This level of inefficiency requires that additional MS/MS experiments would be needed for thorough identifications to be made in a given study.

The use of MS technology in high-throughput proteomics faces several challenges in order to accurately compare differentially expressed proteins from corresponding peptide component information, such as retention time, mass, and signal response. Included among these challenges, software solutions for peak detection, chromatographic spectral alignment, charge-state reduction, and deisotoping need to be implemented in order to reduce the complexity of the continuum MS data and successfully compare differences among samples. The Expression Informatics software, introduced in this study, has been developed to carry out these functionalities for comprehensive, quantitative, differential expression analysis.

Although it has been observed that electrospray ionization (ESI) provides signal responses that correlate linearly with increasing analyte concentration (15-17), historically, there have been concerns regarding nonlinearity of signal response and ion suppression effects (18-21) which have prevented the implementation of a simple LC/MS solution for quantitative proteomics. We outline a quantitative proteomics strategy which employs an LC/MS method as the basis for the analytical strategy for quantifying proteome profile data for differential expression analysis. This method relies on the changes in the peptide analyte signal response from each accurate mass measurement and corresponding retention time (AMRT) component, and to directly reflect their concentrations in one sample relative to another. This method does not require the use of any stable-isotope labeling method or enrichment strategy; however, it does require that the sample preparation conditions are carefully controlled for optimal, quantitative performance. Regardless of the analytical technique, the protein samples must be prepared in a fashion that ensures an efficient and reproducible separation, with concurrent elimination of undesirable artifacts.

In this investigation, we prepared a tryptic digest of human serum spiked with increasing amounts of a standard protein mixture and observed the linear behavior in the signal from digested peptides corresponding to the experimentally configured protein concentrations. The methodology presented in this work maximizes the duty cycle of a quadrupole-time-of-flight (Q-TOF) mass spectrometer to yield extensive quantitative and qualitative information by systematically and simultaneously analyzing the peptide components from large sets of protein mixtures (22,23). Although this work involves the analysis of human serum, this methodology is applicable to any number of biological samples (plasma, urine, whole-cell lysate, organelle, tissue, or microbial).

MATERIALS AND METHODS

Sample Preparation. Six aliquots of human serum (HS, Sigma source) were dispensed into separate Eppendorf tubes (∼200 µg). An equimolar stock solution of exogenous proteins (yeast enolase and alcohol dehydrogenase, rabbit glycogen phosphorylase, and bovine serum albumin and hemoglobin, MPDS proteins) was prepared such that each protein was present at 5 pmol/µL in 50 mM ammonium bicarbonate (pH 8.5). The exogenous proteins were added to each of the six aliquots of human serum such that the final concentration of equimolar proteins was 0.500, 0.250, 0.100, 0.050, 0.025, and 0.010 pmol/µL (final volume of 200 µL), respectively. To avoid working under the specified limits of the pipettor, appropriate dilutions of the stock solution were made to ensure that at least 10-20 µL of stock protein solution, from a calibrated 20-µL pipettor, was added to achieve the desired final exogenous protein concentration. The volumes of the samples were adjusted to 100 µL with 50 mM ammonium bicarbonate (pH 8.5) containing 0.05% RapiGest (25). Protein was reduced in the presence of 10 mM dithiothreitol at 60 °C for 30 min. The protein was alkylated in the dark, in the presence of 50 mM iodoacetamide, at room temperature for 30 min. Proteolytic digestion was initiated by adding modified trypsin (Promega) at a concentration of 75:1 (total protein to trypsin, by

(15) Purves, R. W.; Gabryelski, L. L. Rapid Commun. Mass Spectrom. 1998, 12, 695-700.
(16) Voyksner, R. D.; Lee, H. Rapid Commun. Mass Spectrom. 1999, 13, 1427-1437.
(17) Chelius, D.; Bondarenko, P. J. Proteome Res. 2002, 1, 317-323.
(18) Muller, C.; Schafer, P.; Stortzel, M.; Vogt, S.; Weinmann, W. J. Chromatogr., B 2002, 773, 47-52.
(19) Matuszewski, B. K.; Constanzer, M. L.; Chavez-Eng, C. M. Anal. Chem. 1998, 70, 882-889.
(20) Sangster, T.; Spence, M.; Sinclair, P.; Payne, R.; Smith, C. Rapid Commun. Mass Spectrom. 2004, 18, 1361-1364.
(21) Mei, H.; Hsieh, Y.; Nardo, C.; Xu, X.; Wang, S.; Ng, K.; Korfmacher, W. A. Rapid Commun. Mass Spectrom. 2003, 17, 97-103.
(22) Bateman, R. H.; Hoyes, J. B. U.K. Patent 2,364,168A, 2002.
(23) Purvine, S.; Eppel, J. T.; Yi, E. C.; Goodlett, D. R. Proteomics 2003, 3, 847-850.
(24) Geromanos, S.; Dongre, A.; Opiteck, G.; Silva, J. C. U.K. Patent 2,385,918A, 2003.
(25) Yu, Y. Q.; Gilar, M.; Lee, P. J.; Bouvier, E. S. P.; Gebler, J. C. Anal. Chem. 2003, 75, 6023-6028.
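The mass-tolerance comparison on this page (5 ppm versus 1 Da over the 700 to 2481 mass range) can be made concrete with a short calculation. The following is only an illustrative sketch: the function name `ppm_to_da` is hypothetical, and the peptide counts (7 and 165) are simply the averages quoted in the text, not values derived here.

```python
def ppm_to_da(mass_da: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute half-width in Da."""
    return mass_da * ppm / 1e6


# Mass range and tolerances quoted in the text.
LOW_MASS, HIGH_MASS = 700.0, 2481.0
PPM_TOL = 5.0
NOMINAL_TOL_DA = 1.0

for mass in (LOW_MASS, HIGH_MASS):
    half_width = ppm_to_da(mass, PPM_TOL)
    widening = NOMINAL_TOL_DA / half_width  # how much wider a 1 Da window is
    print(f"mass {mass:7.1f}: 5 ppm = +/-{half_width:.4f} Da, "
          f"1 Da window is {widening:.0f}x wider")

# Average candidate counts quoted in the text for each tolerance.
PEPTIDES_AT_5_PPM = 7
PEPTIDES_AT_1_DA = 165
print(f"candidate ratio: {PEPTIDES_AT_1_DA / PEPTIDES_AT_5_PPM:.1f}")
```

The last line only reproduces the ratio of the two quoted averages, which is in line with the "up to 23 times" figure given in the text.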
2188 Analytical Chemistry, Vol. 77, No. 7, April 1, 2005
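The spiking scheme in Sample Preparation can be checked with the usual C1·V1 = C2·V2 relation. The sketch below is a hedged illustration, not the authors' procedure: the function `spike_plan` and the choice to pre-dilute the stock so that exactly 10 µL is pipetted are assumptions; the text states only that "appropriate dilutions" were made so that 10-20 µL was added.

```python
STOCK_PMOL_PER_UL = 5.0   # stock concentration stated in the text
FINAL_VOLUME_UL = 200.0   # final sample volume stated in the text
TARGETS = [0.500, 0.250, 0.100, 0.050, 0.025, 0.010]  # pmol/uL, from the text
MIN_PIPETTE_UL = 10.0     # lower end of the 10-20 uL window (assumed target)


def spike_plan(target, stock=STOCK_PMOL_PER_UL,
               final_volume=FINAL_VOLUME_UL, min_volume=MIN_PIPETTE_UL):
    """Return (direct_volume_uL, predilution_factor, pipetted_volume_uL).

    direct_volume is the undiluted stock volume that would give the target
    concentration. If that volume is below min_volume, the stock is assumed
    to be pre-diluted so that min_volume is pipetted instead.
    """
    direct = target * final_volume / stock  # C1*V1 = C2*V2 solved for V1
    if direct >= min_volume:
        return direct, 1.0, direct
    factor = min_volume / direct
    return direct, factor, min_volume


for t in TARGETS:
    direct, factor, pipetted = spike_plan(t)
    print(f"{t:.3f} pmol/uL: direct {direct:.1f} uL, "
          f"pre-dilute stock {factor:g}x, pipette {pipetted:.1f} uL")
```

Under these assumptions only the two highest concentrations (20 and 10 µL of undiluted stock) fall inside the pipetting window directly; the remaining four would need a pre-diluted stock, which is consistent with the dilution step the text describes.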
3. weight) and incubated at 37 °C overnight. Each digestion mixture was diluted to a final volume of 200 µL with 50 mM ammonium bicarbonate (pH 8.5) to reduce the concentration of RapiGest detergent to 0.025%. The tryptic peptide solution was centrifuged at 13 000 rpm for 10 min, and the supernatant was transferred into an autosampler vial for peptide analysis via LC/MS. Each sample was analyzed in triplicate. The LC/MS analysis was performed using 10 µL of the final tryptic digest.

HPLC Configuration. Capillary liquid chromatography (CapLC) of tryptic peptides was performed with a Waters CapLC/Waters CapLC autosampler, equipped with a Waters NanoEase Atlantis C18, 300 µm × 15 cm reversed-phase column. The aqueous mobile phase (mobile phase A) contained 1% acetonitrile in water with 0.1% formic acid. The organic mobile phase (mobile phase B) contained 80% acetonitrile in water with 0.1% formic acid. Peptides were loaded onto the column with 6% mobile phase B. Peptides were eluted from the column with a gradient of 6-40% mobile phase B over 100 min at 4.4 µL/min, followed by a 10-min rinse of 99% of mobile phase B. The column was immediately reequilibrated at initial conditions (6% mobile phase B) for 20 min. The lock mass, [Glu1]-fibrinopeptide at 100 fmol/µL (GFP), was delivered from the auxiliary pump of the CapLC at 1 µL/min to the reference sprayer of the NanoLockSpray source.

Mass Spectrometer Configuration. Mass spectrometry analysis of tryptic peptides was performed using a modified Waters/Micromass Q-Tof Ultima API to provide enhanced mass accuracy. Detection events were acquired at 4 GHz. For all measurements, the mass spectrometer was operated in V mode with a typical resolving power of at least 10 000. The spectrum integration time was 1.8 s with an interscan delay time of 0.2 s. All analyses were performed using positive-mode ESI using a NanoLockSpray source. The lock mass channel was sampled every 30 s. The mass spectrometer was calibrated with a GFP solution (100 fmol/µL) delivered through the reference sprayer of the NanoLockSpray source. The doubly charged ion ([M + 2H]2+) was used for initial single point calibration (Lteff), and MS/MS fragment ions of GFP were used to obtain the final instrument calibration. Data acquisition was operated in the exact neutral loss mode, without an include list. Accurate mass LC/MS and LC/MSE data were

than one charge-state, the corresponding area for any given monoisotopic ion is reported as the summed area from all contributing charge states. The retention time is determined for each reported monoisotopic ion at the moment it reaches its maximum intensity (apex). Each detected component is referred to as an AMRT (accurate-mass, retention time) component. An AMRT is extracted from the continuum data only if it exceeds a user-defined, minimum detection threshold. The software is also capable of processing the data using an autothreshold capability which automatically adjusts the ion detection threshold over time as a function of the dynamic range within the mass spectrometric data. The culmination of this process produces an AMRT component list. This list contains many experimentally derived attributes for each of the recorded AMRT components (AMRTs). Included in this output are the weight-averaged monoisotopic mass and charge state, the calculated mass deviation, the deisotoped and charge-state-reduced sum intensity (centered by area), the chromatographic area, the calculated intensity deviation, the observed apex retention time (centered by area), and the observed start and stop time for the ion detection of the corresponding AMRT.

Clustering Peptide Components by Mass and Retention Time. One of the key operations required for the comparative analyses of peptide mixtures is clustering chemically identical components together from replicate injections of the same sample as well as among multiple samples. The clustering algorithm performs multiple binary comparisons to conduct the overall clustering strategy for a complete experiment (27,28). AMRT components from each injection are clustered to align identical components to one another on the basis of a mass precision and a retention time deviation threshold. In an initial binary comparison, a subset of the AMRTs from two separate injections is compared to establish the experimental retention time deviation behavior of identical AMRTs between the two samples. The subset of AMRTs considered in the initial comparison is typically those above the median intensity for the entire data set. In the initial comparison, a coarse threshold of typically 5 min is applied to consider all potential paired candidates. Often, peptides may not reproducibly elute at exactly the same time throughout a replicate
collected using 10 eV for MS and 28-35 eV for MSE acquisition analysis. However, one generally observes a consistent shift in
such that one cycle of MS and MSE data was acquired every 4.0 retention-time, whereby the observed retention time of a given
s. The RF offset was adjusted such that the LC/MS data were set of peptides will deviate systematically, although not necessarily
effectively acquired from m/z 300 to 2000, which ensured that by the same magnitude. Due to the complexity of the data, there
any masses observed in the LC/MSE data less than m/z 300 were often exist conditions under which an AMRT in one condition or
known to arise from dissociations in the collision cell. replicate will match within the threshold criterion to multiple
AMRTs in a different replicate or condition. This, of course, is
RESULTS AND DISCUSSION not desirable, since an AMRT from one condition or replicate
Ion Detection. The ion detection algorithm of the Expression should only match its identical companion in any other condition.
Informatics software uses a maximum likelihood algorithm to To address these situations, the clustering algorithm calculates
deisotope and charge-state-reduce the m/z detections to the the delta retention time for all matched AMRTs and plots the
corresponding monoisotopic m/z (MH+) for each scan of the retention time for each AMRT against the retention time difference
continuum LC/MS data.26 The algorithm also calculates the observed among the corresponding matched components (Figure
observed mass and intensity measurement deviation for every 1A). In doing so, the algorithm can determine the expected
detected component. The chromatographic area associated with
each component is calculated using an integration algorithm (27) Li, G.-Z.; Gorenstein, M.; Geromanos, S.; Silva, J. C.; Dorschel, C. A.; Riley
similar to the ApexTrack peak integration algorithm provided in T. Proc. 52nd ASMS Conf. Mass Spectrom. Allied Top. 2004, TPY 354,
Nashville, TN.
the MassLynx software. If a particular component exists in more (28) Gorenstein, M.; Li, G.-Z.; Geromanos, S.; Silva, J. C.; Dorschel, C. A.; Plumb,
R. S.; Stumpf, C. L.; Riley, T. Proc. 52nd ASMS Conf. Mass Spectrom. Allied
(26) Skilling, J.; Bryan, R. K. Mon. Not. R. Astron. Soc. 1984, 211, 111-124. Top. 2004, WPJ 161, Nashville, TN.
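The binary comparison described in the clustering section can be sketched as a coarse pass followed by a fine pass. This is a minimal illustration and not the Expression Informatics implementation: the `Amrt` record, the function names, and the use of a single global median deviation in place of the density-based, retention-time-dependent boundaries are all assumptions. Only the 10 ppm, 5 min, and 0.5 min tolerances come from the text.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Amrt:
    mass: float       # monoisotopic MH+ (Da)
    rt: float         # apex retention time (min)
    intensity: float  # deisotoped, charge-state-reduced intensity


def ppm_diff(m1: float, m2: float) -> float:
    """Absolute mass difference in parts per million, relative to m2."""
    return abs(m1 - m2) / m2 * 1e6


def coarse_pairs(run_a, run_b, ppm_tol=10.0, rt_tol=5.0):
    """First pass: keep every candidate pair inside the coarse mass/RT window."""
    return [(a, b) for a in run_a for b in run_b
            if ppm_diff(a.mass, b.mass) <= ppm_tol and abs(a.rt - b.rt) <= rt_tol]


def fine_pairs(pairs, fine_tol=0.5):
    """Second pass: keep pairs whose RT deviation is within fine_tol of the
    typical deviation (assumption: one global median, not a local model)."""
    if not pairs:
        return []
    typical = median(b.rt - a.rt for a, b in pairs)
    return [(a, b) for a, b in pairs if abs((b.rt - a.rt) - typical) <= fine_tol]


# Hypothetical data: one mass appears at two retention times in run B.
run_a = [Amrt(1447.8048, 69.50, 1000.0), Amrt(1273.6547, 42.60, 500.0),
         Amrt(1706.7746, 53.60, 800.0)]
run_b = [Amrt(1447.8060, 69.70, 950.0), Amrt(1447.8050, 73.90, 40.0),
         Amrt(1273.6550, 42.85, 520.0), Amrt(1706.7750, 53.80, 790.0)]

coarse = coarse_pairs(run_a, run_b)
fine = fine_pairs(coarse)
```

In this toy input the first pass returns four candidate pairs, and the second pass drops the redundant match that sits more than 4 min away from the consistent shift.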
Analytical Chemistry, Vol. 77, No. 7, April 1, 2005 2189
Figure 1. (A) The AMRTs from two separate injections of the human serum spiked with 5 pmol of exogenous protein were clustered by mass and retention time using the Expression Informatics software to associate identical components. The initial results of the clustering algorithm are displayed by plotting the observed retention time deviation for all matched components versus the retention time of the first injection. Each point represents a paired AMRT having the appropriate mass (±10 ppm) and retention time tolerance (±5.0 min) from the first pass of the clustering algorithm. The red and blue lines define the corresponding upper and lower limits for the retention time tolerance used in the second pass filter. The matched components outside these tolerances are examples of similar mass measurements existing at multiple retention times within the 10 ppm mass tolerance. Although the absolute retention time deviation is ∼1.45 min throughout the entire chromatogram (min = -1.05, max = 0.40), the data indicate that the deviation of matched components at any given retention time does not exceed 0.5 min. (B) Using the retention time deviations from the matched components of the raw data, within the narrow retention time tolerance of 0.5 min, the retention times of the paired AMRTs are normalized and the redundant matched AMRTs are removed by eliminating those paired components outside the fine retention time tolerance. (C) Mass precision measurements from the 3131 replicating AMRTs (in at least two out of three injections) from the human serum samples containing 5.0 and 0.5 pmol exogenous proteins, whose replicate normalized intensity measurements were below 30% Cv. The 3131 replicating AMRTs produced 13 963 individual mass measurements used to produce the histogram plot of the mass precision. A total of 12 981 mass measurements were determined to have a mass precision of ±3 ppm, which constitutes ∼93% of the data set. (D) Coefficient of variation of the intensity measurements from the 3404 replicating AMRTs (in at least two out of three injections) from the human serum samples containing 5.0 and 0.5 pmol exogenous proteins. The 3404 replicating AMRTs produced 5032 combined Cv measurements from both samples and were used to produce the histogram plot of the coefficient of variation of the measured intensity. A total of 4557 of the 5032 Cv measurements were under 30%, which constitutes ∼90% of the data set. The average and median coefficient of variation from these two data sets are 11 and 14%, respectively.
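The retention time normalization summarized in panel B can be illustrated with a short sketch in which matched pairs act as internal standards. The text does not state how the offset is modeled, so the estimator here (the median deviation of the `k` matched pairs eluting nearest to a given time) is an assumption, as are the function names and the example data.

```python
from statistics import median


def local_offset(rt, matched, k=5):
    """Median RT deviation (run B minus run A) of the k matched pairs
    whose run-B retention time is closest to rt."""
    nearest = sorted(matched, key=lambda pair: abs(pair[1] - rt))[:k]
    return median(rt_b - rt_a for rt_a, rt_b in nearest)


def normalize_rts(rts_b, matched, k=5):
    """Subtract the locally estimated offset from each run-B retention time."""
    return [rt - local_offset(rt, matched, k) for rt in rts_b]


# Hypothetical matched pairs (rt_a, rt_b): the drift grows through the run.
matched = [(10.0, 10.1), (20.0, 20.2), (30.0, 30.3), (40.0, 40.4),
           (50.0, 50.5), (60.0, 60.6), (70.0, 70.7)]

normalized = normalize_rts([20.2, 60.6], matched, k=3)
```

With this input the two run-B retention times map back to approximately 20.0 and 60.0 min, which is the behavior panel B depicts: after normalization the deviations collapse toward zero.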
retention time deviations for a given set of peptides at any given moment throughout the chromatogram. The expected retention time deviations are modeled by monitoring the density of points about a retention time deviation plot and determining the upper and lower retention time deviation boundaries for any given binary comparison. Only the matched AMRT components included within the defined retention time deviation boundaries are considered to satisfy the matching criteria. Figure 1A illustrates such a plot. A fine retention time deviation threshold of typically less than 0.5 min is generally observed among paired components between two experiments. Figure 1A illustrates a single pairwise comparison of a replicate injection of the same sample. If the chromatography were ideal, the retention time differences for all matched components would be 0, and the resulting plot would illustrate a straight horizontal line centered at zero deviation. Each point in the plot designates one paired set of components. Since many components elute from the column at any moment in time, the resulting plot should illustrate a dense scattering of points along
the retention time coordinate. Figure 1A illustrates that the reproducibility of the chromatographic peptide separation is ∼0.25 min, with an overall chromatographic deviation of 1.0 min. The pairwise comparison is performed for each of the replicate injections, as well as across the multiple experiments. The retention time deviations observed between the AMRTs of two injections serve as multiple internal standards and are used to determine an appropriate retention time offset for AMRTs eluting at any moment. The retention time offsets are used to normalize the observed retention time for every AMRT component. The effects of the retention time normalization are illustrated in Figure 1B. The output that is generated from the clustering routine is a large matrix, whereby identical components are aligned in each row for subsequent quantitative and statistical analysis. The assembled matrix will not only contain AMRTs which appear in each of the conditions for each of the replicate injections, but may also include those AMRTs which appear reproducibly in one or more of the six conditions.
To illustrate the level of specificity one is capable of obtaining with mass accuracy and retention time reproducibility, the processed data can be queried at different retention time and mass precision tolerances. As an example, injection 2 of the human serum with 2 pmol of MPDS protein produced 2582 AMRTs. The 2582 AMRTs were queried to determine how many were within a ±1-min retention time window and a 10 ppm mass tolerance. Using these tolerances, a total of 36 AMRTs (1.4%) were found to coexist within these parameters. Therefore, these 36 AMRTs could potentially add ambiguity during the clustering process and lead to incorrect clustering of the data. If the mass tolerance is allowed to expand to a 100-mDa error, the ambiguity increases to a total of 76 AMRTs (2.9%). At 1 Da, nominal mass, the ambiguity increases to a total of 657 AMRTs (25.4%). These errors are compounded if the tolerances of both the retention time and mass precision are allowed to expand. If the retention time tolerance is allowed to be within ±5 min, then the following statistics are generated from the single data file: 293 AMRTs (11.3%) at 10 ppm mass tolerance, 441 AMRTs (17.1%) at 100-mDa mass tolerance, and 1112 AMRTs (43.1%) at 1-Da tolerance. These results are based on a single injection of a single sample. If one were to compare replicates among many different samples, this could lead to a significant number of AMRTs being clustered incorrectly and thereby produce highly irreproducible results. Having an LC/MS instrumentation platform that is capable of providing reproducible mass precision and accuracy along with reproducible chromatography will significantly increase the quality of the clustered data and will provide a more robust quantitative proteomics platform.
Data Normalization and Statistical Analysis. Once the AMRT data have been clustered, the clustering algorithm performs a number of mathematical and statistical calculations for the entire data set. To correct for injection variability and total protein load across samples, the intensity measurements for the entire data set are normalized. The intensity measurements of all detected AMRTs from each injection are normalized to a set of AMRTs (endogenous or exogenous) that are known not to have changed among the different samples. The internal AMRT standards used for normalization purposes were required to be present in all six experiments. Although the Expression Informatics software is capable of correcting the mass and intensity measurements that are in dead time, there is a limit to its ability to accurately correct for those measurements.29,30 With this in mind, the internal AMRT standards selected for normalization were well below dead time and existed in all replicates of each sample. The average monoisotopic masses of the AMRTs used for normalization were 1273.6547, 1706.7746, and 2171.1138, with corresponding elution times of approximately 42.60, 53.60, and 101.80 min, respectively. These AMRT components were endogenous to human serum and were determined to originate from transferrin (data not shown).31 Next, the algorithm calculates the replication rate of each AMRT within and among all conditions. The algorithm also calculates the average mass, intensity, area, combined charge-state, and retention-time for each AMRT for all conditions. In addition, a standard deviation and coefficient of variation are determined for each of these measured attributes. Using this information, the software annotates those AMRTs common and unique to each condition. Last, the algorithm performs binary comparisons for each of the conditions to generate an average normalized intensity ratio (log) for all matched AMRTs and also performs a Student’s t-test for each binary comparison. The final results of the clustering algorithm can be exported as a comma-delimited text file containing all of the mass spectrometric and chromatographic attributes for each AMRT, along with all of the mathematical and statistical calculations generated after the clustering process. This clustered data file can be further manipulated or visualized in any of a number of commercially available software packages, such as Microsoft Excel or Spotfire Decision Site.
The precision of the extracted mass measurements of the clustered components from the replicate injections of all samples was typically within ±5 ppm of the mean mass measurement. These data are illustrated in Figure 1C and demonstrate the robustness of the ion extraction software and the stability of the mass measurement instrumentation. In fact, 90% of the total number of replicated components were measured with a precision of ±3 ppm. The reproducibility of the quantitative intensity measurements from the Expression Informatics software is summarized in Figure 1D. These results indicate that the coefficient of variation (Cv) among the replicate injections and across multiple samples was typically less than 15%, with a majority of the quantitative variation lying between 11 and 14% Cv. These observations are typically expected from the Expression Informatics software when using standard protocols for efficient sample preparation.32
Expression Analysis of AMRT Components. The purpose of these experiments was to demonstrate that the Expression Informatics software could ascertain the relative change in abundance of a small subset of proteins (MPDS proteins) spiked into a complex protein background (human serum). The MPDS
(29) Rockwood, A. L.; Fabbi, J. C.; Harris, L.; Davis, L.; Lee, E. D.; Ogden, C.; Tolley, H.; Gunsay, M.; Sin, J. C. N.; Lee, H. G. Proc. 45th ASMS Conf. Mass Spectrom. Allied Top. 1997, WOE 0250, Palm Springs, CA.
(30) Barbacci, D. C.; Russel, D. H.; Schultz, J. A.; Holocek, J.; Ulrich, S.; Burton, W.; Van Stipdonk, M. J. Am. Soc. Mass Spectrom. 1998, 9, 1328-1333.
(31) Silva, J. C.; Richardson, K.; Young, P.; Denny, R.; Neeson, K.; McKenna, T.; Dorschel, C. A.; Li, G.-L.; Gorenstein, M.; Riley, T.; Geromanos, S. Proc. 52nd ASMS Conf. Mass Spectrom. Allied Top. 2004, MPX 452, Nashville, TN.
(32) Dorschel, C. A.; Gorenstein, M.; Li, G.-Z.; Silva, J. C.; Geromanos, S.; Riley, T. Proc. 52nd ASMS Conf. Mass Spectrom. Allied Top. 2004, TPY 458, Nashville, TN.
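The normalization and binary comparison steps of the Data Normalization and Statistical Analysis section can be sketched as follows. Several details are assumptions because the text does not specify them: each injection is scaled by the mean intensity of its internal-standard AMRTs, the log ratio uses base 10, and the t statistic is the pooled-variance two-sample form. The identifiers and example values are hypothetical, and no p-value is computed here.

```python
import math
from statistics import mean, variance


def normalize_injection(intensities, standard_ids):
    """Scale one injection so its internal-standard AMRTs average to 1.0."""
    factor = mean(intensities[i] for i in standard_ids)
    return {amrt_id: value / factor for amrt_id, value in intensities.items()}


def log_ratio(cond_a, cond_b):
    """log10 of the ratio of mean normalized intensities (B over A)."""
    return math.log10(mean(cond_b) / mean(cond_a))


def student_t(cond_a, cond_b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(cond_a), len(cond_b)
    pooled = ((na - 1) * variance(cond_a) + (nb - 1) * variance(cond_b)) / (na + nb - 2)
    return (mean(cond_b) - mean(cond_a)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical injection: two internal standards and one peptide of interest.
injection = {"std1": 200.0, "std2": 400.0, "pep": 600.0}
norm = normalize_injection(injection, ["std1", "std2"])

# Hypothetical normalized replicate intensities for one AMRT in two conditions.
cond_a = [2.0, 2.2, 1.8]
cond_b = [1.0, 0.9, 1.1]
ratio = log_ratio(cond_a, cond_b)
t_stat = student_t(cond_a, cond_b)
```

Normalizing to AMRTs that do not change between samples keeps the ratio insensitive to injection volume and total protein load, which is the stated purpose of the step.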
Figure 2. (A) The base peak intensity (BPI) of human serum with five equimolar exogenous proteins spiked at decreasing levels (5.00, 2.00, 1.00, 0.50, 0.25, and 0.10 pmol). (B) The selected ion chromatogram (SIC) of the doubly charged peptide ion, 724.34 (±0.05 m/z). The corresponding SICs were integrated using MassLynx processing software between 68.00 and 71 min. Processing parameters were set for automatic noise measurement, Savitzky-Golay smoothing (three channels, two smoothes), and ApexTrack peak integration. (C) The continuum mass spectrum at the apex of the corresponding 724.34 selected ion chromatogram in panel B (from 600 to 825 m/z). (D) The lock-mass-corrected, centroided mass spectrum of the 724.34 isotope cluster (between 722 and 729 m/z) from panel C (smoothing: Savitzky-Golay, three channels, two smoothes; centering: three channels, centroid top 80%, centered by area; lock-mass-corrected against the monoisotopic ion of Glu-Fib, 785.8426 m/z).
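The selected ion chromatogram in panel B amounts to summing the signal inside a narrow m/z window in each scan and integrating over a retention time range. The sketch below uses plain trapezoidal integration as a stand-in: it does not reproduce ApexTrack, the noise measurement, or the Savitzky-Golay smoothing. The scan representation, function names, and data are hypothetical.

```python
def extract_sic(scans, target_mz, half_width=0.05):
    """scans: list of (rt_min, [(mz, intensity), ...]).
    Returns [(rt, summed intensity inside target_mz +/- half_width)]."""
    return [(rt, sum(i for mz, i in peaks if abs(mz - target_mz) <= half_width))
            for rt, peaks in scans]


def integrate(sic, rt_start, rt_stop):
    """Trapezoidal area of the SIC between rt_start and rt_stop (inclusive)."""
    pts = [(rt, y) for rt, y in sic if rt_start <= rt <= rt_stop]
    return sum((t2 - t1) * (y1 + y2) / 2 for (t1, y1), (t2, y2) in zip(pts, pts[1:]))


# Hypothetical scans around the 724.34 ion; out-of-window ions are ignored.
scans = [
    (68.0, [(724.34, 0.0), (700.10, 50.0)]),
    (69.0, [(724.36, 100.0), (724.90, 30.0)]),
    (70.0, [(724.33, 100.0)]),
    (71.0, [(724.34, 0.0)]),
]

sic = extract_sic(scans, target_mz=724.34)
area = integrate(sic, 68.0, 71.0)
```

A narrow extraction window only isolates the peptide when the mass measurement is stable from scan to scan, which is why the lock-mass correction matters for this kind of quantitation.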
proteins were spiked at levels well below that of the most abundant proteins in the complex background. Six samples were prepared to reflect a dilution series of the MPDS proteins ranging from 10 to 500 fmol/µL. The samples were digested with trypsin as described in the Material and Methods Section, and the resulting polypeptide mixtures were analyzed in triplicate by LC/MS.22-24 To demonstrate that the quantitative information relating to the MPDS proteins was available in the acquired LC/MS data, a manual analysis was performed on a previously characterized AMRT (m/z 724.41 at 69.5 min). Figure 2A depicts six total ion chromatograms (TICs) obtained from the LC/MS acquisitions. For the sake of space, only one replicate TIC is illustrated for each of the six different samples. The TICs illustrate a high degree of similarity among the six different samples, despite an overall 50-fold change in the relative levels of MPDS peptides throughout the six samples. Figure 2B illustrates the selected ion chromatograms (SICs) for the m/z 724.41 (z = 2, MH2+) ion at ∼69.5 min and the associated integrated peak areas, as determined by MassLynx. The identity of this peptide was validated by DDA to use as a proof-of-concept model for the subsequent quantitative comparison (data not shown, VVGLSTLPEIYEK peptide from yeast ADH). Figure 2C illustrates the six individual MS spectra obtained from each sample at the chromatographic apex of the SIC in Figure 2B (m/z 724.41). Each spectrum presented in Figure 2C is normalized to the highest ion in the spectrum to illustrate the dilution of the 724.41 MH2+ ion over the six different concentrations. The data presented in each spectrum illustrate a very high degree of similarity with respect to the other coeluting peptides in the background of human serum. This similarity is reflected not only in the number of ions present in each scan but also in the correlation among their respective intensities and relative intensity ratios. The degree of chromatographic reproducibility is further supported, at the global level, from the Expression Informatics processing and analysis of the clustered AMRTs obtained from each of the replicate analyses, as will be illustrated later. Figure 2D depicts each spectrum after it has been smoothed (Savitzky-Golay smoothing, three channels, two smoothes), centered (three channels, 80% of the centroid top, centered by area), and lock-mass corrected against the monoisotopic ion of GFP (m/z 785.8426). Comparison of the lock-mass-corrected mass measurements obtained from the six individual samples (m/z 724.41, MH2+) reflects the level of mass precision obtained from this methodology. It also establishes that one can use an LC/MS-based approach for relative quantitation of peptide components in a complex protein sample, provided that sufficient mass and retention time reproducibility are obtained. Table 1 outlines the results obtained from the manual interrogation of the raw data using the commercially available MassLynx software. The inte-
Table 1. Summary Table of the Manual and Automated Analysis (a)
                                      manual processing (MassLynx)                       automated processing (Expression Informatics)
human serum + exogenous  theoretical                               calcd      error                                   calcd      error
proteins, pmol           ratio (b)    int (c)  MH+ (d)    ppm (e)  ratio (f)  (%) (g)    int (h)  MH+ (i)    ppm (j)  ratio (k)  (%) (l)
5.00                     1.0          15871    1447.8134  5.9      1.0                   545213   1447.8112  -4.4     1.0
2.00                     2.5          5498     1447.8082  -2.3     2.9        15.2       205709   1447.8151  -7.1     2.7        8.0
1.00                     5.0          2775     1447.8062  -0.9     5.7        14.2       107305   1447.8086  -2.6     5.1        2.0
0.50                     10.0         1584     1447.8082  -2.3     10.0       0.1        51992    1447.8089  -2.8     10.5       5.1
0.25                     20.0         688      1447.7998  3.5      23.1       15.3       23808    1447.8102  -3.7     22.9       14.4
0.10                     50.0         343      1447.8042  0.4      46.3       -7.5       10885    1447.8121  -5.0     50.1       0.2
RMS error                                                 3.1                 5.4                            4.6                 3.5
(a) The mass measurements and signal response measurements obtained from manual analysis using MassLynx software and automated processing using the Expression Informatics software for the 1447.8048 monoisotopic ion (at ∼69 min) originating from the VVGLSTLPEIYEK peptide of yeast ADH are described in the table.
(b) The theoretical relative ratio for the spiked ADH peptide.
(c) The integrated peak measurement obtained using ApexTrack peak integration in MassLynx.
(d) The calculated monoisotopic mass from the lock-mass-corrected measurement of the 12C isotope of the doubly charged ion cluster.
(e) The corresponding ppm error obtained using the MassLynx software when compared to the theoretical monoisotopic mass, 1447.8048.
(f) The calculated relative ratio of each condition compared to the 5 pmol condition from the measured peak response.
(g) The relative percent error between the calculated relative ratio and the theoretical relative ratio.
(h) The integrated peak measurement obtained using the peak integration algorithm in the Expression Informatics software.
(i) The calculated monoisotopic mass from the lock-mass corrected measurement of the doubly charged ion cluster using the maximum entropy algorithm in the Expression Informatics software.
(j) The corresponding ppm error obtained using the Expression Informatics software when compared to the theoretical monoisotopic mass, 1447.8048.
(k) The calculated relative ratio of each condition compared to the 5 pmol condition from the measured peak response.
(l) The relative percent error between the calculated relative ratio and the theoretical relative ratio.
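As a worked check of footnotes e and j, the ppm error and its RMS value can be recomputed from the MH+ values listed in Table 1. The function names are illustrative. The signs printed in the table do not appear to follow a single convention, so this sketch compares magnitudes only through the RMS value.

```python
import math

THEORETICAL_MH = 1447.8048  # theoretical monoisotopic MH+ from Table 1, footnote a

# MH+ values copied from Table 1 (5.00 down to 0.10 pmol).
manual_mh = [1447.8134, 1447.8082, 1447.8062, 1447.8082, 1447.7998, 1447.8042]
automated_mh = [1447.8112, 1447.8151, 1447.8086, 1447.8089, 1447.8102, 1447.8121]


def ppm_error(measured, theoretical=THEORETICAL_MH):
    """Signed mass error in parts per million (measured minus theoretical)."""
    return (measured - theoretical) / theoretical * 1e6


def rms(values):
    """Root mean square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


rms_manual = rms([ppm_error(m) for m in manual_mh])
rms_automated = rms([ppm_error(m) for m in automated_mh])
```

Under these definitions the manual values give an RMS error close to the tabulated 3.1 ppm, and the automated values come out near the tabulated 4.6 ppm.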
grated peak area and accurate mass measurement of the monoisotopic ion for each sample is indicated in Table 1. In addition, the observed mass error (ppm) has been determined, along with the corresponding calculated response ratios for each of the samples, when compared to the 5-pmol sample. Upon manual interrogation of the raw continuum data, the overall quantitative accuracy is within ±10%. The average mass accuracy obtained from MassLynx for the yeast ADH peptide (724.41 m/z, z = 2) was below 5 ppm (RMS). Table 1 illustrates that the information is available in the raw continuum data to display the relative change in abundance of the yeast ADH protein (from 5000 to 100 fmol) in the complex background of human serum. The quality of the mass spectrometric data is highlighted in Table 1, which contains the average accurate mass measurement and corresponding parts-per-million error for the test AMRT in each of the separate samples. It also includes the average normalized intensity and the corresponding intensity ratios from the manual analysis of the yeast ADH peptide across all the six experiments.
The 18 LC/MS experiments were processed with the Expression Informatics software for a profiling analysis study. The Expression Informatics results of the same AMRT described earlier (m/z 724.41 MH2+, 1447.81 MH+) produced an average mass precision error below 5 ppm (4.1 ppm, RMS) and an average quantitative error of ∼5%. The results obtained from the automated processing of the raw continuum data were, thus, in agreement with the manually obtained data from MassLynx, described above. The response curves generated from the manual and automated processing of the VVGLSTLPEIYEK tryptic peptide from yeast ADH are illustrated in Figure 3A. These data demonstrate the consistency between the two data processing methods, whereby the two normalized response curves are nearly coincident, with an overall correlation coefficient of 0.999. The results show the linearity of the two data processing methods across the 2 orders of magnitude dynamic range inherent in the outlined experiments. Interestingly, the linear response of the exogenous ADH peptide (724.41 MH2+) seems to illustrate little or no ion suppression effects which may have resulted from the high background of human serum peptides throughout the dilution series. Though the data presented in Figures 2 and 3 and Table 1 are quite encouraging, the challenge hinges on creating a software processing package that is capable of automating the process, whereby hundreds or thousands of TICs can be compared quantitatively.
Table 2 illustrates the number of AMRTs obtained from each replicate of each sample, along with the associated combined intensity for all extracted AMRTs (after normalization). The variability associated with the number of extracted AMRTs is presented in Table 2 and illustrates a high degree of reproducibility across replicate injections. However, the data also illustrate a steady decrease in the number of AMRTs reported along with a decrease in the combined intensity as one examines those samples containing the highest concentration of exogenous proteins to the lowest concentration of exogenous proteins. We plotted the change in the average number of AMRTs and total intensity versus the spiked protein concentration for the six samples and found the data to be linear with R^2 values of 0.9878 and 0.9838, respectively (data not shown). Since the background of human serum proteins should not change from sample to sample, it is our contention that the associated y intercepts of 1964 AMRTs and 7.0 × 10^7 intensity counts represent the basal level (number and associated intensity) of AMRTs present in the human serum.
The 18 resulting xml files were generated from the continuum LC/MS data using the Expression Informatics software and contained both the mass spectrometric and chromatographic attributes for all extracted AMRTs. The xml files were processed using the associated clustering algorithm to group identical AMRTs across the replicate injections for all the six samples. In the replicate analysis of the human serum with 5 pmol of MPDS protein, 68% of the total AMRTs were replicated in three out of three injections (2577 AMRTs of the 3797 total clustered AMRTs). The 2577 replicating AMRTs accounted for ∼87% of the total detected intensity. The overall trend suggests that the missing observations are due to the ion detection threshold parameters. Decreasing the stringency to two out of three replicate injections resulted in 85% of the total AMRTs and constituted 95% of the
Figure 3. (A) The response curves of the doubly charged polypeptide ion (observed 724.34 m/z, VVGLSTLPEIYEK peptide from yeast ADH) at ∼69 min from manual interrogation and automated processing of the spiked human serum data. The response measurements were normalized to the maximum observed response from the corresponding dilution series. (B) A subset of 25 response curves obtained from the output of the clustering tool of the Expression Informatics software. The clustered output file was imported into Spotfire, and the data were parsed by the average monoisotopic mass from all replicates of each sample using the trellis option in Spotfire. The average monoisotopic mass for each AMRT component is indicated at the top of each plot. Those AMRTs associated with the human serum (rows 1-4) did not change throughout the dilution series and are indicated by those response curves with a slope of 0, whereas all of those AMRTs that are associated with the exogenous proteins have a similar positive slope (row 5). The AMRTs were validated to each of the corresponding exogenous proteins: 1422.7261 MH+, EFTPVLQADFQK (bovine hemoglobin (α-chain)); 1529.7344 MH+, VGAHAGEYGAEALER (bovine hemoglobin (β-chain)); 1576.7762 MH+, LKPDPNTLCDEFK (bovine albumin); 1578.8098 MH+, VDDFLLSLDGTANK (yeast enolase); and 1580.8387 MH+, QIIEQLSSGFFSPK (rabbit phosphorylase B).
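The normalization used for panel A (each response divided by the maximum of its dilution series) and the agreement between the two processing methods can be illustrated with the integrated intensities listed in Table 1. This is a sketch under stated assumptions: the Pearson coefficient is used as the measure of agreement, which the text does not specify, and the function names are illustrative.

```python
import math

# Integrated responses from Table 1 (5.00, 2.00, 1.00, 0.50, 0.25, 0.10 pmol).
manual_int = [15871, 5498, 2775, 1584, 688, 343]
automated_int = [545213, 205709, 107305, 51992, 23808, 10885]


def normalize_to_max(values):
    """Scale a response series so that its largest value equals 1.0."""
    top = max(values)
    return [v / top for v in values]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


manual_norm = normalize_to_max(manual_int)
automated_norm = normalize_to_max(automated_int)
r = pearson(manual_norm, automated_norm)
```

Normalizing each series to its own maximum puts the manual and automated responses on a common scale even though their raw intensity units differ by more than an order of magnitude.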
total detected intensity. In the replicate injection of the 5-pmol condition, the average intensity measurement for those AMRTs which replicated in three out of three injections was 36 666 counts, whereas the average intensity measurements for the AMRTs which replicated in either two or one out of three injections were 13 750 and 8411 counts, respectively. Lowering the ion detection threshold increases the number of AMRTs reported but also lowers the total fraction of replicating AMRTs. In addition, lowering the ion detection threshold does not dramatically affect the fraction of total intensity attributed to the replicating AMRTs.
A total of 1776 AMRTs were found in common to all replicates of all six samples, constituting an average combined intensity of 7.12 × 10^7 counts. These results are consistent with the hypothesis regarding the basal level of the human serum AMRTs found to replicate among the six samples. Though one may suspect the total number of AMRTs to be low, considering the complexity of the background of human serum peptides, it should be noted that the purpose of this study is to verify that the Expression Informatics software identifies the appropriate change in relative abundance among the spiked MPDS peptides. The ion detection
Table 2. Summary Table of the Ion Detection Results (a)
sample                    measurement            inj 1        inj 2        inj 3        CV, %
5 pmol ProStds HsSera     AMRTs                  3142         3231         3212         1.47
                          normalized intensity   1.04 × 10^8  1.03 × 10^8  9.90 × 10^7  2.59
2 pmol ProStds HsSera     AMRTs                  2382         2582         2758         7.31
                          normalized intensity   8.40 × 10^7  8.58 × 10^7  8.85 × 10^7  2.66
1 pmol ProStds HsSera     AMRTs                  2383         2087         2244         6.62
                          normalized intensity   7.61 × 10^7  8.22 × 10^7  8.18 × 10^7  4.27
0.5 pmol ProStds HsSera   AMRTs                  2005         2062         2106         2.46
                          normalized intensity   7.56 × 10^7  7.46 × 10^7  7.97 × 10^7  3.57
0.25 pmol ProStds HsSera  AMRTs                  2012         1939         2058         3.00
                          normalized intensity   8.00 × 10^7  7.27 × 10^7  8.12 × 10^7  5.88
0.1 pmol ProStds HsSera   AMRTs                  1972         2002         1923         2.03
                          normalized intensity   7.79 × 10^7  7.08 × 10^7  7.38 × 10^7  4.81
(a) The total number of AMRTs is indicated for each replicate analysis of the six human serum samples. The sum of the normalized intensity for each replicate injection is listed below each of the corresponding total AMRT values. The coefficient of variation of the extracted AMRTs and their associated normalized intensity is calculated for each replicate injection. The ion detection parameters were set up to extract those multiply charged ions (charge states between 2 and 6) which exceeded 200 counts (center by area, after deisotoping).
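The CV column of Table 2 and the linear trend reported in the text (average AMRT count versus spiked amount, y intercept of 1964 AMRTs) can be recomputed from the AMRT counts in the table. The sketch assumes the sample standard deviation (n - 1 denominator) and an ordinary least-squares fit. Both assumptions are consistent with the tabulated values but are not stated by the authors.

```python
from statistics import mean, stdev

# AMRT counts per injection from Table 2, keyed by spiked amount (pmol).
amrt_counts = {
    5.0: [3142, 3231, 3212],
    2.0: [2382, 2582, 2758],
    1.0: [2383, 2087, 2244],
    0.5: [2005, 2062, 2106],
    0.25: [2012, 1939, 2058],
    0.1: [1972, 2002, 1923],
}


def cv_percent(values):
    """Coefficient of variation in percent, using the sample standard deviation."""
    return stdev(values) / mean(values) * 100


def linear_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, r_squared)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / sxx
    return slope, my - slope * mx, sxy * sxy / (sxx * syy)


cvs = {pmol: round(cv_percent(counts), 2) for pmol, counts in amrt_counts.items()}
amounts = list(amrt_counts)
averages = [mean(counts) for counts in amrt_counts.values()]
slope, intercept, r_squared = linear_fit(amounts, averages)
```

With these inputs the CVs match the AMRT rows of the table, the intercept comes out at about 1964 AMRTs, and the coefficient of determination is about 0.987, close to the reported 0.9878.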
threshold was set to generate AMRTs which spanned 3-4 orders of magnitude dynamic range within a given sample. The MPDS proteins were spiked into the human serum at levels such that their intensities were within this window of dynamic range. By applying these threshold parameters, we were able to demonstrate the appropriate response with the ADH peptide and, therefore, continue with the analysis to characterize the remaining AMRTs.

The clustering results were exported from the Expression Informatics software and imported directly into Spotfire for evaluation. With identical components clustered across the replicate injections of the six samples (dilution series), one can readily obtain response curves for each of the clustered components. Figure 3B illustrates response curves for a subset of those clustered AMRTs, in which the average normalized intensity is plotted as a function of the quantity (femtomole) of spiked MPDS proteins. The bottom five plots each represent an individual peptide from four of the remaining five exogenous proteins. All of these response curves have a similar slope that is indicative of the configured serial dilution. The response curves in Figure 3B correspond to extracted AMRTs that replicated in all six samples of human serum with the exogenous proteins. The AMRTs with the experimentally determined monoisotopic m/z of 1422.7261, 1529.7344, 1576.7762, 1578.8098, and 1580.8387 represent peptides from bovine hemoglobin (β-chain), bovine hemoglobin (α-chain), bovine albumin, yeast enolase, and rabbit phosphorylase B, respectively. The mass accuracies associated with these corresponding peptides are all within ±5 ppm of the theoretical tryptic peptide mass. All of the plots for the remaining AMRTs have a slope of 0 and, therefore, correspond to background serum peptides that do not change in relative concentration across the six individual samples. For the point of this illustration, the x axis corresponds to the concentration of spiked exogenous proteins. In a biomarker discovery study, the concentration dependence could easily be replaced by a time course or different perturbations, such as drug dosage or environmental conditions. The ability to display these response curves (or conditional profiles) for all matched AMRTs enables one to perform comprehensive global comparisons rather than multiple binary comparisons. Using this approach, the AMRTs can be rapidly screened and characterized on the basis of their collective behavior across the multiple conditions. Self-organizing maps (SOMs) or k-means clustering techniques can be used to associate AMRTs that exhibit the same behavior, and by extension, may be related to the same protein, metabolic, or regulatory pathway(s) (refs 33, 34).

Figure 4A illustrates a diagonal plot of the log of the average normalized intensity for matched AMRTs from the 5-pmol mixture (x axis) versus the 2-pmol mixture (y axis). The data illustrate two distinct clusters of ions spanning close to 4 orders of magnitude dynamic range in ion detection and share 2997 matched AMRT component pairs between the two conditions. The data points are colored by their respective t-test score of the normalized intensities for all replicate injections between the two conditions to illustrate that the variance between the two conditions is statistically significant. The yellow data points illustrate those matched AMRT components with a t-test score of <0.01, indicating that there is less than a 1% chance that the observed change is not due to the applied perturbation. Although more sophisticated multicomponent statistical methods could be performed on these data, all the comparisons in this work were performed using a binary Student’s t-test. The t-test was performed on only the highly reproducible AMRTs which were found to be in the majority of the replicates for each of the two test conditions (at least two out of three). In the presentation of this work, there was no attempt to correct for missing data. If an AMRT occurred in only one out of three replicate injections in either of the two conditions, the AMRT was ignored in the subsequent quantitative processing. Since this approach does not require the use of enrichment techniques, there is quite a bit of peptide redundancy for each representative protein in the sample. By not limiting the number of peptides per protein, we can afford to use a conservative approach to our data reduction scheme and propagate the highest quality data into the quantitative processing without jeopardizing the number of proteins that can be quantified and subsequently identified. In Figure 4A-E, the blue data points are those AMRTs that did not exhibit any change due to the applied perturbation as defined by the Student’s t-test (>0.01). The red data point highlights the AMRT described in Table 1, for the purpose of the manual analysis and comparison to the automated processing.

(33) Mirkin, B. Mathematical Classification and Clustering, Nonconvex Optimization and Its Applications; Pardalos, P., Horst, R., Eds.; Kluwer Academic Publishers: The Netherlands, 1996, Chapter 11.
(34) MacQueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; Le Cam, L. M., Neyman, J., Eds.; University of California Press: Berkeley and Los Angeles, CA; Vol 1, pp 281-297.
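The binary comparison described above (replication in at least two out of three injections per condition, no correction for missing data, Student's t-test at the 0.01 level) can be sketched as follows. This is a minimal illustration and not the Expression Informatics implementation: the function names, the use of `None` for an injection in which the AMRT was not detected, and the small table of two-sided critical t values (chosen to avoid a statistics library dependency) are all assumptions made for this example.

```python
import math
from statistics import mean, variance

# Two-sided critical t values at alpha = 0.01 for the degrees of freedom that
# can arise with 2 or 3 replicates per condition (df = n1 + n2 - 2).
T_CRIT_01 = {2: 9.925, 3: 5.841, 4: 4.604}

def student_t(a, b):
    """Pooled-variance (Student's) t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        # Identical replicates: no spread to test against.
        return (0.0 if mean(a) == mean(b) else math.inf), df
    return (mean(a) - mean(b)) / se, df

def is_changed(cond_a, cond_b, min_reps=2):
    """Return True/False for significance at the 0.01 level, or None when the
    AMRT fails the replication rule and is ignored (hypothetical helper)."""
    a = [x for x in cond_a if x is not None]
    b = [x for x in cond_b if x is not None]
    if len(a) < min_reps or len(b) < min_reps:
        return None  # found in only one of three injections: not quantified
    t, df = student_t(a, b)
    return abs(t) >= T_CRIT_01[df]
```

Comparing |t| against the critical value is equivalent to thresholding the p score at 0.01; a statistics package would normally report the p value directly. The table only covers two or three replicates per condition, which matches the triplicate design described here.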
Analytical Chemistry, Vol. 77, No. 7, April 1, 2005 2195
Figure 4. Diagonal plots of the normalized log intensity. (A) Comparison of clustered AMRTs between human serum with 5.0 pmol of exogenous
protein mixture versus human serum with 2.0 pmol of exogenous protein mixture. For each matched AMRT component, the average log intensity from
each condition is plotted along each of the two axes. The data are presented without applying any statistical filters, which are obtained from the clustered
data set. (B) Same comparison as illustrated in Panel A; however, the data have been filtered using a number of the available statistical measures
obtained from the clustering tool of the Expression Informatics software. The data have been filtered to show only those matched AMRTs which were
found to have a coefficient of variation of the normalized intensity of ≤30% among the replicate injections (minimum two out of three replicates per
condition), as well as an observed mass precision of ≤10 ppm among the replicate injections. (C) Comparison of clustered AMRTs between human
serum with 5 pmol of exogenous protein mixture versus human serum with 0.1 pmol of exogenous protein mixture after applying the statistical filter
described above. (D) Comparison of clustered AMRTs between human serum with 5.0 pmol of exogenous protein mixture and human serum with 1.0
pmol of exogenous protein mixture after applying the statistical filter described above. (E) Comparison of clustered AMRTs between human serum with
0.50 pmol of exogenous protein mixture and human serum with 0.25 pmol of exogenous protein mixture after applying the statistical filter described
above. The data presented in all panels are colored by binned probability score (p score) from a binary Student’s t-test. Those AMRTs which had a
probability score of ≤0.01 are yellow, whereas those that are >0.01 are blue. The red data point corresponds to the monoisotopic ion of 1447.8048, which
originates from the VVGLSTLPEIYEK peptide of yeast ADH. The interpolated black line corresponds to the expected fold change for each binary comparison.
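The position of the expected fold-change line in these log-log plots follows directly from the spiked amounts. As a short worked relation, with symbols introduced here only for illustration (I_x and I_y for the average normalized intensities on the two axes, c_x and c_y for the corresponding spiked amounts, and F for the expected fold change), and assuming that intensity scales linearly with the amount loaded:

```latex
F = \frac{c_x}{c_y}, \qquad
\log_{10} I_y = \log_{10} I_x - \log_{10} F
```

Unchanged serum peptides (F = 1) fall on the diagonal, while the spiked peptides fall on a parallel line offset by log10 F: about 0.40 for panels A and B (5.0/2.0 = 2.5), 1.70 for panel C (5/0.1 = 50), 0.70 for panel D (5.0/1.0 = 5), and 0.30 for panel E (0.50/0.25 = 2).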
Figure 5. (A) A scatter plot of the average normalized intensity of the clustered AMRTs versus their corresponding coefficient of variation
among the replicate injections for human serum spiked with 5 pmol of exogenous protein versus human serum spiked with 2 pmol of exogenous
protein. The blue data points represent 1840 AMRTs which satisfy the statistical filters described in Figure 2B, whereas the red data points
illustrate the 1157 AMRTs that were removed during the filtering process. (B) A histogram plot of the corresponding fold changes determined
among the 1840 AMRTs which met the applied statistical measures.
From this analysis, it is suggested that the yellow data points represent peptides from the MPDS proteins, whereas the blue data points originate from peptides from human serum proteins. The information that is provided from this methodology allows one to apply user-defined thresholds to the resulting statistical analysis performed on any of the experimental attributes relating to each AMRT cluster, as well as a minimum replication rate within and across conditions as a means to extract the highest quality data for subsequent quantitative analysis. Figure 4B depicts 1840 (61.4%) of the matched AMRT component pairs from Figure 4A after applying a specific set of statistical thresholds to reveal the highest quality data. These statistical measurements are provided by the Expression Informatics software and are included in the corresponding output file. In this instance, the data were filtered by (1) applying a replication requirement, in which corresponding AMRTs must exist in at least two out of the three replicate injections for each condition, (2) requiring that the coefficient of variation for the normalized intensities of an AMRT be ≤30%, and (3) requiring that the mass precision of clustered AMRTs be <10 ppm across all samples. After applying the statistical thresholds, 1840 of the initial 2997 matched AMRTs (61.4%) remained to illustrate the two distinct sets of peptides, the unaffected human serum peptides and the affected MPDS peptides. The breadth of each group of ions along the two diagonals is influenced by the degree of variability inherent to the analytical method and will determine the confidence interval for a specific fold change. Interestingly, the 1840 statistically significant AMRTs represent >90% of the total average normalized intensity found in each condition. A total of 724 of the 2997 AMRTs were attributed to AMRTs which occurred in only one out of the three replicate injections, an additional 384 AMRTs had coefficients of variation >30%, and 49 AMRTs had mass precision errors exceeding 10 ppm. This indicates that the most variable data are due to the lower intensity AMRTs, as can be seen in Figure 5A. Figure 5A depicts a scatter plot of the average normalized intensity of each AMRT from the filtered data versus the observed coefficient of variation for the entire clustered data set. The blue data points are the subset of 1840 AMRTs which meet the statistical parameters described above. As expected, the data illustrate that the statistical filtering process had the most significant effect on the lowest intensity AMRTs, since they will be most influenced by coeluting AMRTs and will therefore tend to exhibit the highest variability (Cv).

Manual inspection of the clustered output of the replicate injections of the 5-pmol condition indicated that fewer than 80 of the AMRTs determined to be found in only one out of three replicate injections could have been associated with an AMRT determined to have replicated in only two out of three replicate injections. In this particular example, this represents a false clustering rate of ∼2%. However, since only the AMRTs found to replicate in only one out of three injections are eliminated from the quantitative processing, the information describing these potentially discarded AMRTs is still captured in those AMRTs which occurred in two out of three injections.

One of the key features of this methodology is that it is an unbiased approach. The method does not require prescreening of polypeptide pools for those peptides that contain specific amino acids. This unbiased approach produces significantly more peptide ions per protein than some other quantitative methodologies which utilize isotope-coded affinity tags. In addition, the quantitative nature of this methodology allows the user to apply statistical methods to remove polypeptide ions (AMRTs) that exhibit questionable reproducibility from further consideration without jeopardizing the ability to find lower level changes. Figure 5B depicts a histogram plot of the observed fold change for the 1840 filtered AMRTs. The data presented illustrate two Gaussian distributions about the x axis which are centered at values of 1.0 and 2.5. These values correlate with the predicted results for the serum-related peptides (no change) and the spiked exogenous peptides (2.5-fold change).
Figure 4C,D represents two additional diagonal plots of the log average normalized intensity of the 5-pmol mixtures versus both the 100-fmol and 1-pmol mixtures. The results from Figure 4C begin to test the limits of this methodology. At 100 fmol of spiked MPDS protein, we are approaching the limit of detection for the 300-µm scale chromatography selected for these series of experiments. This can manifest itself in the results by attenuating the expected fold change, producing more scatter between the upper and lower limits of the expected fold change. In addition, it should be noted that there are a number of peptides from the exogenous MPDS proteins that are chemically identical to a subset of the human serum proteins. Among these are human serum albumin and human hemoglobin. These chemically identical peptides will show an attenuated fold change as a function of their relative abundance over that of the endogenous peptide. Figure 4E illustrates the 250-fmol mixture versus the 100-fmol mixture. These plots illustrate two distinct ion distributions of AMRTs, which correlate with the relative concentration change of the MPDS proteins between the two samples as well as those unaffected human serum proteins. The blue data points represent those AMRTs that do not show any relative change with statistical significance between the two conditions (human serum proteins); the yellow data points represent those peptide components that do exhibit statistically significant changes between the two conditions (MPDS proteins).

To confirm the quantitative results illustrated in Figure 4A-E, we performed a simple peptide mass fingerprinting search using the average mass measurement of each AMRT that was found in at least two out of three replicate injections from all six conditions with a t-test probability score of ≤0.01 (67 AMRTs in all). We searched a Swissprot database of over 200 000 entries at 5 ppm mass accuracy with no missed cleavages and required four minimum peptides to match. The search results accounted for 59 of the 67 total AMRTs. The 59 AMRTs identified 47 proteins by peptide mass fingerprint, which included the 5 spiked-in proteins (MPDS proteins) as well as 37 isoforms of the MPDS proteins from different species, including 23 different isoforms of glycogen phosphorylase. Last, the final five identifications were examples of very high molecular weight proteins (>120 kDa) which have tryptic peptides with monoisotopic masses in common with the MPDS proteins. The level of redundancy is not surprising, since the search was performed using a non-species-specific database. In a true biomarker discovery experiment, the peptide mass fingerprint would most likely be restricted to a nonredundant database of a specific organism to reduce the number of isoforms one may obtain from the homology/identity found in a cross-species database.

If we had spiked the proteins in at different concentrations, we could have used the quantitative fold change of the AMRTs as an additional filter or scoring mechanism to eliminate the wrongfully assigned high molecular weight protein assignments. We also suggest that the use of accurate mass in conjunction with fold change is a powerful strategy for MS-based protein identification. Since enzymatically digested proteins typically produce many peptides and this methodology does not limit the number of observed peptides per protein through the use of any type of affinity capture enrichment protocol, proteins which exhibit a relative fold change in expression will produce a number of AMRTs (tryptic peptides) that will exhibit the same change in expression within some reasonable tolerance. It is suggested that the use of accurate mass in conjunction with the quantitative fold change provides additional specificity to allow rapid screening of complex protein mixtures for targeted proteins of interest which exhibit a change in relative abundance. In instances for which further validation is needed, the user has the ability to construct a targeted include list for subsequent MS/MS analysis from the accurate mass and retention times (AMRTs) obtained from the LC/MS acquisition. However, the parallel LC/MS and LC/MSE strategy implemented for this analysis contains not only the precursor ion information but also the associated fragment ion information from all the observed precursors and allows one to identify the precursor ions without having to perform the targeted MS/MS experiment (ref 31). Low-energy precursor data are collected into function 1, while the associated elevated-energy data are collected into the second function. The low-energy precursor ions are associated with their corresponding high-energy fragment ions using the obtained chromatographic attributes. In this type of experiment, the software uses both the low- and elevated-energy data for qualitative assignment (ref 20).

The data presented in this manuscript illustrate that the Expression Informatics software is capable of reducing large sets of LC/MS analyses from complex protein mixtures to a simple list of AMRT components that have undergone a change in relative abundance due to the applied perturbation. These capabilities are provided for by the use of the ion detection, clustering, and quantitative functionalities. Having the ability to reduce these complex protein mixtures to a simple list of AMRT components greatly simplifies the problem of properly identifying the proteins affected by the applied perturbation. In many cases, a subsequent protein identification from such complex protein mixtures can be ascertained from a simple peptide mass fingerprint of the specific AMRTs within a given fold change window. To illustrate this powerful capability, we conducted a PMF search with only those AMRTs present in at least two out of the three replicate injections for all conditions (5000-100 fmol MPDS proteins), with Cv’s of the associated replicating intensities of under 30%, with a mass precision of under 10 ppm, and illustrating a fold change with a t-test score of <1% (Figure 4). The PMF search was queried against a human database of 27 000 entries along with the five exogenous proteins and was conducted without considering any missed cleavages and with a mass accuracy of <10 ppm. The PMF search returned 33 peptides from rabbit glycogen phosphorylase, 18 peptides from bovine serum albumin, 14 peptides from yeast enolase, 12 peptides from yeast alcohol dehydrogenase, 4 peptides from bovine hemoglobin (α), and 7 peptides from bovine hemoglobin (β). Among the set of identified exogenous proteins, the peptide VVGLSTLPEIYEK (1447.8048 MH+) was among the 12 peptides matched to yeast alcohol dehydrogenase. This peptide was one of the most intense peptides from yeast ADH and exhibited a linear response when spiked into human serum, from 100 to 5000 fmol (Figure 3A).

Figure 6 shows the 19 peptides matched from bovine serum albumin via the PMF search. The average normalized intensity values are plotted for each of the albumin peptides for each of the six conditions (5000 to 100 fmol, on column). It is clear from this illustration that not all peptides ionized with sufficient
efficiency to be detected in all six conditions. In addition, as the concentration of the protein was decreased, the number of detected peptides decreased in a predictable manner. The observed peptides exhibit a characteristic ionization pattern that is consistent throughout the six experiments. The continuity of the ionization pattern illustrates the level of reproducibility one can obtain with ESI-mass spectrometry. This ionization pattern serves as a characteristic feature (ionization map) for the tryptic peptides of bovine serum albumin and can be taken into account for future characterization of this protein. These results indicate that the two bovine serum albumin peptides, LGEYGFQNALIVR and HLVDEPQNLIK, are the two most efficiently ionized tryptic peptides. Using this information, it is not surprising that the least intense peptides observed at the 5-pmol level are not observed at either the 100 or 250 fmol level (LCVLHEK, EACFAVEGPK, DLGEEHFK, and AEFVEVTK). If the ionization pattern for a given protein was known, one could predict which peptides should be present at a given concentration of protein. Additionally, using this information, the PMF assignment could be validated further by correlating the observed ionization pattern to the known tryptic peptide ionization pattern for that particular protein. The ultimate goal for this type of approach would be to create ionization maps for all proteins in a proteome database. If this could be accomplished, identifying proteins by mass, retention time, fold change and ionization efficiency would become an exercise in accounting (or ion accounting).

Figure 6. Intensity profiles for characterized bovine serum albumin peptides. The AMRTs which originated from bovine serum albumin were
identified by PMF. The identities of the corresponding bovine serum albumin AMRTs were correlated to the clustered output file. The average
intensity measurements for each of the bovine serum albumin peptides (AMRTs) are plotted from each of the six conditions.

CONCLUSION
The purpose of this work was to illustrate that the Expression Informatics software could reduce the complexity of the continuum LC/MS data to a list of AMRT components that have undergone a statistically significant change in relative abundance due to the applied perturbation. The Expression Informatics software is capable of extracting chromatographic and mass spectrometric attributes from an LC/MS analysis of a complex protein digest in a quantitatively reproducible manner. Additionally, the ion detection and clustering capabilities provided in the Expression Informatics software demonstrated that one can monitor slight changes in relative abundance among different conditions without requiring the use of isotopic or metabolic labeling strategies. The analytical protocols employed in this study demonstrate that the combination of accurate mass and chromatographic retention time, in conjunction with other measured attributes, such as fold change and ion intensities, can provide a unique signature for each peptide contained in a complex protein digest mixture. We believe that there is ample literature precedent to indicate that electrospray time-of-flight mass spectrometry is clearly capable of producing quantitative results for peptide ions over 3 orders of magnitude in concentration. We contend that the Expression Informatics software is capable of extracting AMRT information at the low end of detection, provided that ion statistics support an accurate mass measurement and produce a defined chromatographic apex. The experiments outlined in this manuscript were performed over a period of 36 h. An additional set of experiments was performed with the dilution series of the MPDS proteins alone over the same 36-h time frame for the purpose of another topic. The quantitative results from the two individual sets of experiments indicated that the methodology is a robust method for global protein profiling.

The demonstrated ability to generate response curves for the human serum and the exogenous protein peptides suggests that the sample preparation and data acquisition were quantitatively reproducible, with average Cv’s of <15%. Although one can use affinity capture techniques to enrich samples for peptides containing specific amino acid residues and thereby simplify the polypeptide pool, we have shown that this is not necessary to obtain accurate quantitation. On the contrary, our methodology provides
access to more peptides per protein and allows one to establish high confidence levels for each quantified protein.

The peptide components which exhibit significant up- or down-regulation can be further investigated by conducting a modification of the traditional peptide mass fingerprint analysis. One can maximize the information obtained from the clustered AMRT analysis by recognizing that a relative change in abundance for a particular protein will manifest itself by producing multiple peptide fragments which should exhibit the same relative change in abundance. Using the quantitative information available from the clustered AMRT analysis, the user can choose to submit for PMF identification only those accurate mass measurements which exhibit the proper fold change. Organizing the AMRTs by observed fold change for subsequent PMF identification is quite empowering, since it provides additional stringency to the qualitative identification of a protein that is quantitatively consistent with the data. For those users who require structural information for qualitative peptide/protein assignment, the list of AMRTs can be used to organize a targeted include list for subsequent peptide identification studies by traditional methods, such as targeted MS/MS. Using the accurate mass and retention time information obtained from the AMRT analysis, as well as the associated quantitative and statistical analysis, one can carry out a targeted MS/MS analysis to identify only those AMRTs that have undergone a statistically significant change in relative abundance between conditions. This would eliminate the accumulation of MS/MS data on proteins that are not affected in specific studies and would allow one to maximize the efficiency of the MS/MS data collection per unit time in a biomarker discovery setting. However, the initial LC/MS experiments could have been acquired using the alternate scanning methodology described in this work, in which the collision energy alternates between low and elevated energy throughout the entire LC/MS analysis to capture both precursor and associated fragment ion information in one experiment. Precursor information is captured in one function under low-energy conditions, and the associated fragment ions are captured in a second function under elevated-energy conditions. Each reported precursor will have an associated set of fragment ions that were detected in the elevated energy function. Although it is not described in this manuscript, the additional information provided in the elevated energy function affords additional specificity for each of the detected precursors in the low energy function. Although changing the chromatography column may cause a slight shift in the observed retention time, the associated elevated energy accurate mass measurements will allow one to manage the data properly across multiple experiments. As described in this work, the precursor information obtained in this mode is quantitative and reproducible. A more detailed explanation of the alternate scanning methodology is described in the following study by Silva and co-workers (ref 31) and will be the topic of future work.

ACKNOWLEDGMENT
The authors acknowledge the valuable contributions of Timothy Riley and Bob Bateman throughout the development of this work. The authors also acknowledge Jeanne Li for her contributions in the laboratory and throughout the editing of this manuscript. Last, we extend our gratitude to our collaborators who helped develop the Expression Informatics software by embracing the methodology and applying themselves to demonstrate its utility (Stanely Hefta, Ashok Dongre, Gregory Opiteck, Martin Wiedmann, Deborah H. Smith, Arthur Moseley, Kevin Blackburn, Danie Schlatzer, Craig A. Townsend, Minerva Hughes, Christopher T. Walsh, and Jun Yin).

SUPPORTING INFORMATION AVAILABLE
Four posters that were presented at the 52nd ASMS Conference on Mass Spectrometry and Allied Topics, 2004, Nashville, TN (see explanations of refs 27, 28, 31, 32 in text) are available as Supporting Information. This material is available free of charge via the Internet at http://pubs.acs.org.

Received for review October 19, 2004. Accepted January 13, 2005.
AC048455K
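The fold-change-guided PMF submission described in the Conclusion can be sketched as follows. This is a toy illustration under stated assumptions, not the search engine used in this work: the function names, the dictionary layout, the fractional fold-change tolerance, and the in-memory "database" of theoretical peptide masses are all hypothetical. Only the ideas of a ppm mass tolerance and a minimum number of matching peptides are taken from the text.

```python
def ppm_error(observed, theoretical):
    """Signed mass error of an observed mass relative to theory, in ppm."""
    return 1e6 * (observed - theoretical) / theoretical

def select_by_fold_change(amrts, expected, tolerance=0.2):
    """Keep AMRTs whose fold change lies within a fractional tolerance of the
    expected value (the tolerance of 20% is an arbitrary illustrative choice)."""
    lo, hi = expected * (1.0 - tolerance), expected * (1.0 + tolerance)
    return [a for a in amrts if lo <= a["fold_change"] <= hi]

def pmf_match(amrts, peptide_masses, max_ppm=10.0, min_peptides=4):
    """Report proteins with at least min_peptides theoretical tryptic masses
    matched by an observed AMRT mass within max_ppm."""
    hits = {}
    for protein, masses in peptide_masses.items():
        matched = [
            a["mass"]
            for a in amrts
            if any(abs(ppm_error(a["mass"], m)) <= max_ppm for m in masses)
        ]
        if len(matched) >= min_peptides:
            hits[protein] = matched
    return hits
```

A typical use would first call `select_by_fold_change` with the expected ratio for the comparison of interest and then pass only the surviving AMRTs to `pmf_match`, so that masses from unchanged background peptides cannot contribute to a protein assignment.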