I express my heartfelt thanks to Assistant Prof. Mr. Dharmendra Giri and the Head of the
Department of KURUKSHETRA UNIVERSITY for providing me an opportunity to do
my internship at SYNMEDICS LABORATORY PVT. LTD., FARIDABAD.
I would also like to thank the Directors of SYNMEDICS LABORATORY PVT. LTD. for
granting me the opportunity to work in such a fine institute and for providing all the
facilities necessary to carry out my project.
I take this opportunity to express my deep sense of gratitude to the faculty members who
helped me complete my training and my project, especially Mr. D.S. TOMAR, who gave
memorable support and guidance at every step of my work, without which it would not
have been possible to complete my project on "In-silico Genomic Analysis of
genes, In-silico competitive inhibition of herbal plant extracts on the CD-40 gene and Quality
control of Anti-Rheumatoid drugs".
I would like to thank Mr. V.K. TYAGI and Mr. ISH JAIN, who have always been inspiring
mentors and have enlightened me by sharing their knowledge and providing all the
information required for carrying out my project.
Rheumatoid arthritis (RA) is an autoimmune disease that is strongly associated with the
expression of several genes, including HLA-DRB1, CD40, IL2, PADI4 and STAT4. Here I present
a genome analysis of these five genes implicated in rheumatoid arthritis. I found that these
genes have the property of antigenicity and can trigger the immune system even when only a
small concentration of ligand is available. I examined these genes with tools such as
CLC Workbench and Geneious Pro. After the genome analysis I performed
competitive inhibition using the chemical constituents of natural herbs such as nirgundi and brahmi.
I found very good interactions for 44 of the 46 ligands tested. I also performed
quality checking of anti-rheumatic drugs at Synmedics Laboratory and determined the exact
proportions of the combination drugs given for rheumatoid arthritis.
Bioinformatics uses computer databases to store, retrieve and assist in understanding
biological information. Genome-scale sequencing projects have led to an explosion of genetic
sequences available for automated analysis. These gene sequences are the codes, which direct
the production of proteins that in turn regulate all life processes. The student will be shown
how these sequences can lead to a much fuller understanding of many biological processes
allowing pharmaceutical and biotechnology companies to determine for example new drug
targets or to predict if particular drugs are applicable to all patients. Students will be
introduced to the basic concepts behind Bioinformatics and Computational Biology tools.
Hands-on sessions will familiarize students with the details and use of the most commonly
used online tools and resources.
The course will cover the use of NCBI's Entrez, BLAST, PSI-BLAST, ClustalW, Pfam,
PRINTS, BLOCKS, Prosite and the PDB. An introduction to database design and the
principles of programming languages will be presented.
In all areas of biological and medical research, the role of the computer has been dramatically
enhanced in the last five to ten years. While the first wave of computational analysis
did focus on sequence analysis, where many highly important unsolved problems still remain,
the current and future needs will in particular concern sophisticated integration of extremely
diverse sets of data. These novel types of data originate from a variety of experimental
techniques of which many are capable of data production at the levels of entire cells, organs,
organisms, or even populations. The
main driving force behind the changes has been the advent of new, efficient experimental
techniques, primarily DNA sequencing, that have led to an
exponential growth of linear descriptions of protein, DNA and RNA molecules. Other new
data producing techniques work as massively parallel versions of traditional experimental
methodologies. Genome-wide gene expression measurement using DNA microarrays is, in
essence, a realization of tens of thousands of Northern blots. As a result, computational
support in experiment design, processing of results and interpretation of results has become
indispensable.
Major Research Areas:
Since the Phage Φ-X174 was sequenced in 1977, the DNA sequences of hundreds of
organisms have been decoded and stored in databases. These data are analyzed to determine
genes that code for proteins, as well as regulatory sequences. A comparison of genes within a
species or between different species can show similarities between protein functions, or
relations between species (the use of molecular systematics to construct phylogenetic trees).
With the growing amount of data, it long ago became impractical to analyze DNA sequences
manually. Today, computer programs are used to search the genome of thousands of
organisms, containing billions of nucleotides. These programs compensate for
mutations (exchanged, deleted or inserted bases) in the DNA sequence, in order to identify
sequences that are related, but not identical. A variant of this sequence alignment is used in
the sequencing process itself. The so-called shotgun sequencing technique (which was used,
for example, by The Institute for Genomic Research to sequence the first bacterial genome,
Haemophilus influenzae) does not give a sequential list of nucleotides, but instead the
sequences of thousands of small DNA fragments (each about 600-800 nucleotides long). The
ends of these fragments overlap and, when aligned in the right way, make up the complete
genome. Shotgun sequencing yields sequence data quickly, but the task of assembling the
fragments can be quite complicated for larger genomes. In the case of the Human Genome
Project, it took several months of CPU time (on a circa-2000 vintage DEC Alpha computer)
to assemble the fragments. Shotgun sequencing is the method of choice for virtually all
genomes sequenced today, and genome assembly algorithms are a critical area of
bioinformatics research.
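The fragment-assembly idea described above can be sketched as a toy greedy overlap merger: repeatedly join the pair of reads with the longest suffix-prefix overlap. The reads and the minimum-overlap cutoff below are invented for illustration; real assemblers must also cope with sequencing errors and repeats.

```python
# Toy greedy shotgun assembly: merge the pair of fragments with the
# longest suffix-prefix overlap until no overlaps remain.

def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` matching a prefix of `b`."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def assemble(fragments):
    frags = list(fragments)
    while len(frags) > 1:
        best = (0, None, None)  # (overlap length, index i, index j)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    olen = overlap(a, b)
                    if olen > best[0]:
                        best = (olen, i, j)
        olen, i, j = best
        if olen == 0:  # no overlaps left; concatenate what remains
            return "".join(frags)
        merged = frags[i] + frags[j][olen:]
        frags = [f for k, f in enumerate(frags) if k not in (i, j)]
        frags.append(merged)
    return frags[0]

reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
print(assemble(reads))  # → ATTAGACCTGCCGGAATAC
```

Greedy merging is quadratic in the number of reads, which is why assembling real genomes from millions of fragments required months of CPU time and motivates more sophisticated assembly algorithms.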
Another aspect of bioinformatics in sequence analysis is the automatic search for genes and
regulatory sequences within a genome. Not all of the nucleotides within a genome are genes.
Within the genome of higher organisms, large parts of the DNA do not serve any obvious
purpose. This so-called junk DNA may, however, contain unrecognized functional elements.
Bioinformatics helps to bridge the gap between genome and proteome projects, for example
in the use of DNA sequences for protein identification.
In the context of genomics, annotation is the process of marking the genes and other
biological features in a DNA sequence. The first genome annotation software system was
designed in 1995 by Dr. Owen White, who was part of the team that sequenced and analyzed
the first genome of a free-living organism to be decoded, the bacterium Haemophilus
influenzae. Dr. White built a software system to find the genes (places in the DNA sequence
that encode a protein), the transfer RNA, and other features, and to make initial assignments
of function to those genes. Most current genome annotation systems work similarly, but the
programs available for analysis of genomic DNA are constantly changing and improving.
Computational evolutionary biology
Evolutionary biology is the study of the origin and descent of species, as well as their change
over time. Informatics has assisted evolutionary biologists in several key ways; it has enabled
researchers to:
- trace the evolution of a large number of organisms by measuring changes in their DNA,
rather than through physical taxonomy or physiological observations alone;
- more recently, compare entire genomes, which permits the study of more complex
evolutionary events, such as gene duplication, lateral gene transfer, and the prediction of
bacterial speciation factors;
- build complex computational models of populations to predict the outcome of the system
over time; and
- track and share information on an increasingly large number of species and organisms.
Future work endeavours to reconstruct the now more complex tree of life.
The area of research within computer science that uses genetic algorithms is sometimes
confused with computational evolutionary biology, but the two areas are unrelated.
Biodiversity of an ecosystem might be defined as the total genomic complement of a
particular environment, from all of the species present, whether it is a biofilm in an
abandoned mine, a drop of sea water, a scoop of soil, or the entire biosphere of the planet
Earth. Databases are used to collect the species names, descriptions, distributions, genetic
information, status and size of populations, habitat needs, and how each organism interacts
with other species. Specialized software programs are used to find, visualize, and analyze the
information, and most importantly, communicate it to other people. Computer simulations
model such things as population dynamics, or calculate the cumulative genetic health of a
breeding pool (in agriculture) or endangered population (in conservation). One very exciting
potential of this field is that entire DNA sequences, or genomes of endangered species can be
preserved, allowing the results of Nature's genetic experiment to be remembered in silico, and
possibly reused in the future, even if that species is eventually lost.
Analysis of gene expression
The expression of many genes can be determined by measuring
mRNA levels with multiple techniques including microarrays, expressed cDNA sequence tag
(EST) sequencing, serial analysis of gene expression (SAGE) tag sequencing, massively
parallel signature sequencing (MPSS), or various applications of multiplexed in-situ
hybridization. All of these techniques are extremely noise-prone and/or subject to bias in the
biological measurement, and a major research area in computational biology involves
developing statistical tools to separate signal from noise in high-throughput gene expression
studies. Such studies are often used to determine the genes implicated in a disorder: one
might compare microarray data from cancerous epithelial cells to data from non-cancerous
cells to determine the transcripts that are up-regulated and down-regulated in a particular
population of cancer cells.
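The up-/down-regulation comparison described above can be sketched with log2 fold changes between mean expression values in two conditions. The gene names and values below are invented; real analyses add normalization and statistical testing to separate signal from noise.

```python
# Toy differential-expression screen: log2 fold change of mean
# expression, tumour vs. normal, for each gene.
import math

def log2_fold_changes(tumour, normal):
    """Map gene -> log2(mean tumour expression / mean normal expression)."""
    return {g: math.log2((sum(tumour[g]) / len(tumour[g]))
                         / (sum(normal[g]) / len(normal[g])))
            for g in tumour}

tumour = {"GENE_A": [40.0, 44.0], "GENE_B": [5.0, 3.0], "GENE_C": [10.0, 10.0]}
normal = {"GENE_A": [10.0, 11.0], "GENE_B": [16.0, 16.0], "GENE_C": [9.0, 11.0]}

fc = log2_fold_changes(tumour, normal)
up   = [g for g, v in fc.items() if v >= 1]   # up-regulated (≥ 2-fold)
down = [g for g, v in fc.items() if v <= -1]  # down-regulated
print(up, down)  # ['GENE_A'] ['GENE_B']
```

A ±1 threshold on the log2 scale corresponds to a two-fold change, a common (if arbitrary) starting cutoff before statistical filtering.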
Analysis of regulation
Regulation is the complex orchestration of events starting with an extra-cellular signal and
ultimately leading to an increase or decrease in the activity of one or more protein molecules.
Bioinformatics techniques have been applied to explore various steps in this process. For
example, promoter analysis involves the elucidation and study of sequence motifs in the
genomic region surrounding the coding region of a gene. These motifs influence the extent to
which that region is transcribed into mRNA. Expression data can be used to infer gene
regulation: one might compare microarray data from a wide variety of states of an organism
to form hypotheses about the genes involved in each state. In a single-cell organism, one
might compare stages of the cell cycle, along with various stress conditions (heat shock,
starvation, etc.). One can then apply clustering algorithms to that expression data to
determine which genes are co-expressed. For example, the upstream regions (promoters) of
co-expressed genes can be searched for over-represented regulatory elements.
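The clustering step described above rests on a similarity measure between expression profiles; a minimal sketch uses Pearson correlation across conditions, with highly correlated pairs treated as candidate co-regulated genes. The gene names and profiles below are invented for illustration.

```python
# Pearson correlation between expression profiles as a co-expression
# measure; pairs with r near 1 are candidates for shared regulation.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

profiles = {  # expression across conditions (heat shock, starvation, ...)
    "geneX": [1.0, 2.0, 3.0, 4.0],
    "geneY": [2.1, 4.0, 6.2, 7.9],   # tracks geneX closely
    "geneZ": [4.0, 3.0, 2.0, 1.0],   # anti-correlated with geneX
}
r = pearson(profiles["geneX"], profiles["geneY"])
print(round(r, 3))  # close to 1 → candidate co-expressed pair
```

A full clustering algorithm would build the pairwise correlation matrix and group genes hierarchically; the promoters of each resulting cluster can then be searched for over-represented motifs.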
Analysis of protein expression
Protein microarrays and high throughput (HT) mass spectrometry (MS) can provide a
snapshot of the proteins present in a biological sample. Bioinformatics is very much involved
in making sense of protein microarray and HT MS data; the former approach faces similar
problems as with microarrays targeted at mRNA, the latter involves the problem of matching
large amounts of mass data against predicted masses from protein sequence databases, and
the complicated statistical analysis of samples where multiple, but incomplete peptides from
each protein are detected.
Analysis of mutations in cancer
Massive sequencing efforts are currently underway to identify point mutations in a variety of
genes in cancer. The sheer volume of data produced requires automated systems to read
sequence data, and to compare the sequencing results to the known sequence of the human
genome, including known germline polymorphisms. Oligonucleotide microarrays, including
comparative genomic hybridization and single nucleotide polymorphism arrays, able to probe
simultaneously up to several hundred thousand sites throughout the genome are being used to
identify chromosomal gains and losses in cancer. Hidden Markov model and change-point
analysis methods are being developed to infer real copy number changes from often noisy
data. Further informatics approaches are being developed to understand the implications of
lesions found to be recurrent across many tumors.
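The change-point idea mentioned above can be illustrated with a toy single-split detector: choose the position that minimizes the total within-segment squared error of the copy-number signal. The probe values are invented log2 ratios; real methods (HMMs, multi-change-point segmentation) handle noise and multiple breakpoints.

```python
# Toy change-point detection for copy-number log2 ratios: find the
# single split minimizing the summed within-segment squared error.

def best_changepoint(values):
    """Return (index, cost): split before `index` minimizing segment SSE."""
    def sse(seg):
        m = sum(seg) / len(seg)
        return sum((v - m) ** 2 for v in seg)
    best = min(range(1, len(values)),
               key=lambda i: sse(values[:i]) + sse(values[i:]))
    return best, sse(values[:best]) + sse(values[best:])

# Ratios near 0 (two copies) followed by ratios near 1 (a gain)
ratios = [0.1, -0.1, 0.0, 0.2, 1.1, 0.9, 1.0, 1.2]
idx, cost = best_changepoint(ratios)
print(idx)  # 4: the gain starts at probe index 4
```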
Some modern tools (e.g. Quantum 3.1) provide tools for changing the protein sequence at
specific sites through alterations to its amino acids, and predict changes in bioactivity after
such alterations.
Prediction of protein structure
Protein structure prediction is another important application of bioinformatics. The amino
acid sequence of a protein, the so-called primary structure, can be easily determined from the
sequence on the gene that codes for it. In the vast majority of cases, this primary structure
uniquely determines a structure in its native environment. (Of course, there are exceptions,
such as the bovine spongiform encephalopathy - aka Mad Cow Disease - prion.) Knowledge
of this structure is vital in understanding the function of the protein. For lack of better terms,
structural information is usually classified as one of secondary, tertiary and quaternary
structure. A viable general solution to such predictions remains an open problem. As of now,
most efforts have been directed towards heuristics that work most of the time.
One of the key ideas in bioinformatics is the notion of homology. In the genomic branch of
bioinformatics, homology is used to predict the function of a gene: if the sequence of gene A,
whose function is known, is homologous to the sequence of gene B, whose function is
unknown, one could infer that B may share A's function. In the structural branch of
bioinformatics, homology is used to determine which parts of a protein are important in
structure formation and interaction with other proteins. In a technique called homology
modeling, this information is used to predict the structure of a protein once the structure of a
homologous protein is known. This currently remains the only way to predict protein
structures reliably.
One example of this is the similar protein homology between hemoglobin in humans and the
hemoglobin in legumes (leghemoglobin). Both serve the same purpose of transporting
oxygen in the organism. Though both of these proteins have completely different amino acid
sequences, their protein structures are virtually identical, which reflects their near-identical
functions.
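The homology reasoning above is usually quantified as percent identity over an alignment; a common rule of thumb (an assumption here, not a hard rule) is that roughly 30% identity or more makes homology modeling plausible. The aligned fragments below are invented for illustration, with `-` marking a gap.

```python
# Percent identity between two equal-length aligned sequences, as a
# quick screen for whether homology modeling is worth attempting.

def percent_identity(a, b):
    """Percent of matching positions; gap characters never count as matches."""
    assert len(a) == len(b)
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)

known   = "MV-LSPADKTNVKAAWGKV"   # aligned fragment, structure known
unknown = "MVHLTPEEKSAVTALWGKV"   # aligned fragment, structure unknown
pid = percent_identity(known, unknown)
print(round(pid, 1))
```

In practice the alignment itself is produced first (e.g. by a dynamic-programming or BLAST-style search), and the identity is measured over the aligned region rather than the full sequences.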
Comparative genomics
The core of comparative genome analysis is the establishment of the correspondence between
genes (orthology analysis) or other genomic features in different organisms. It is these
intergenomic maps that make it possible to trace the evolutionary processes responsible for
the divergence of two genomes. A multitude of evolutionary events acting at various
organizational levels shape genome evolution. At the lowest level, point mutations affect
individual nucleotides. At a higher level, large chromosomal segments undergo duplication,
lateral transfer, inversion, transposition, deletion and insertion. Ultimately, whole genomes
are involved in processes of hybridization, polyploidization and endosymbiosis, often leading
to rapid speciation. The complexity of genome evolution poses many exciting challenges to
developers of mathematical models and algorithms, who have recourse to a spectra of
algorithmic, statistical and mathematical techniques, ranging from exact, heuristics, fixed
parameter and approximation algorithms for problems based on parsimony models to Markov
Chain Monte Carlo algorithms for Bayesian analysis of problems based on probabilistic
models.
Modeling biological systems
Systems biology involves the use of computer simulations of cellular subsystems (such as the
networks of metabolites and enzymes which comprise metabolism, signal transduction
pathways and gene regulatory networks) to both analyze and visualize the complex
connections of these cellular processes. Artificial life or virtual evolution attempts to
understand evolutionary processes via the computer simulation of simple (artificial) life
forms.
High-throughput image analysis
Computational technologies are used to accelerate or fully automate the processing,
quantification and analysis of large amounts of high-information-content biomedical imagery.
Modern image analysis systems augment an observer's ability to make measurements from a
large or complex set of images, by improving accuracy, objectivity, or speed. A fully
developed analysis system may completely replace the observer. Although these systems are
not unique to biomedical imagery, biomedical imaging is becoming more important for both
diagnostics and research. Some examples are:
- high-throughput and high-fidelity quantification and sub-cellular localization (high-content
screening, cytohistopathology)
- clinical image analysis and visualization
- determining the real-time air-flow patterns in the breathing lungs of living animals
- quantifying occlusion size in real-time imagery of the development of, and recovery during,
arterial injury
- making behavioral observations from extended video recordings of laboratory animals
- infrared measurements for metabolic activity determination
As of September 2007, the complete sequences were known of about 1879 viruses, 577
bacterial species and roughly 23 eukaryotic organisms, of which about half are fungi.
Genomics is the study of the genomes of organisms. The field includes intensive efforts to
determine the entire DNA sequence of organisms and fine-scale genetic mapping efforts.
The field also includes studies of intragenomic phenomena such as heterosis, epistasis,
pleiotropy and other interactions between loci and alleles within the genome. In contrast, the
investigation of the roles and functions of single genes is a primary focus of molecular
biology or genetics and is a common topic of modern medical and biological research.
Research of single genes does not fall into the definition of genomics unless the aim of this
genetic, pathway, and functional information analysis is to elucidate its effect on, place in,
and response to the entire genome's networks.
For the United States Environmental Protection Agency, "the term 'genomics' encompasses
a broader scope of scientific inquiry and associated technologies than when genomics was
initially considered. A genome is the sum total of all an individual organism's genes. Thus,
genomics is the study of all the genes of a cell, or tissue, at the DNA (genotype), mRNA
(transcriptome), or protein (proteome) levels."
Genomics was established by Fred Sanger when he first sequenced the complete genomes of
a virus and a mitochondrion. His group established techniques of sequencing, genome
mapping, data storage, and bioinformatic analyses in the 1970-1980s.
A major branch of genomics is still concerned with sequencing the genomes of various
organisms, but the knowledge of full genomes has created the possibility for the field of
functional genomics, mainly concerned with patterns of gene expression during various
conditions.
The most important tools here are microarrays and bioinformatics. Study of the full set of
proteins in a cell type or tissue, and the changes during various conditions, is called
proteomics.
A related concept is materiomics, which is defined as the study of the material properties of
biological materials (e.g. hierarchical protein structures and materials, mineralized biological
tissues, etc.) and their effect on the macroscopic function and failure in their biological
context, linking processes, structure and properties at multiple scales through a materials
science approach.
The actual term 'genomics' is thought to have been coined by Dr. Tom Roderick, a geneticist
at the Jackson Laboratory (Bar Harbor, ME) at a meeting held in Maryland on the mapping of
the human genome in 1986.
In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University
of Ghent (Ghent, Belgium) were the first to determine the sequence of a gene: the gene for
Bacteriophage MS2 coat protein. In 1976, the team determined the complete nucleotide-
sequence of bacteriophage MS2-RNA. The first DNA-based genome to be sequenced in its
entirety was that of bacteriophage Φ-X174 (5,368 bp), sequenced by Frederick Sanger in
1977.
The genome of ''Haemophilus influenzae'', sequenced in 1995, was the first of a free-living
organism, and since then genomes have been sequenced at a rapid pace.
Most of the bacteria whose genomes have been completely sequenced are problematic
disease-causing agents, such as ''Haemophilus influenzae''. Of the other sequenced species,
most were chosen because they were well-studied model organisms or promised to become
good models. Yeast (''Saccharomyces cerevisiae'') has long been an important model
organism for the eukaryotic cell, while the fruit fly ''Drosophila melanogaster'' has been a
very important tool (notably in early pre-molecular genetics). The worm ''Caenorhabditis
elegans'' is an often used simple model for multicellular organisms. The zebrafish
''Brachydanio rerio'' is used for many developmental studies on the molecular level and the
flower ''Arabidopsis thaliana'' is a model organism for flowering plants. The Japanese
pufferfish (''Takifugu rubripes'') and the spotted green pufferfish (''Tetraodon nigroviridis'')
are interesting because of their small and compact genomes, containing very little non-coding
DNA compared to most species.
A rough draft of the human genome was completed by the Human Genome Project in early
2001, creating much fanfare. By 2007 the human sequence was declared "finished" (fewer than
one error in 10,000 bases and all chromosomes assembled). Display of the results of the
project required significant bioinformatics resources. The sequence of the human reference
assembly can be explored using the UCSC Genome Browser.
TYPES OF GENOMICS:
In the last few years some interesting findings have been recorded and several new branches
have emerged, and consequently the area of genomics has widened considerably. Genomics is
broadly categorised into two branches: structural genomics and functional genomics.
1. Structural Genomics :
The structural genomics deals with DNA sequencing, sequence assembly, sequence
organisation and management. Basically it is the starting stage of genome analysis i.e.
construction of genetic, physical or sequence maps of high resolution of the organism. The
complete DNA sequence of an organism is its ultimate physical map. Owing to rapid
advances in DNA technology and the completion of several genome sequencing projects in
recent years, the concept of structural genomics has reached a stage of transition.
It now also includes the systematic determination of the 3D structures of proteins found in
living cells, because proteins vary between groups of individuals and these variations are
reflected in genome sequences.
2. Functional Genomics :
Based on the information from structural genomics, the next step is to reconstruct genome
sequences and determine the functions that the genes perform. This information also supports
the design of experiments to establish what specific parts of the genome do. The strategy of
functional genomics has widened the scope of biological investigation: rather than studying a
single gene or protein, it systematically studies all genes and proteins. Large-scale
experimental methodologies, combined with statistical and computational analysis of the
results, therefore characterise functional genomics. Functional genomics thus provides novel
information about the genome, easing the understanding of gene function, protein function
and protein interactions. Much of this previously untold story is being unraveled by scientists
following the development of microarray technology and proteomics. These two technologies
make it possible to observe, in a snapshot, all the genes expressed in a cell or tissue under
varying environmental conditions such as temperature, pH, etc.
GENOMICS APPLICATIONS :
1. Functional genomics
2. Gene identification
3. Comparative genomics
4. Genome wide expression analysis
Main focus of the genome analysis will be the microarray technique and the
preprocessing and analysis methods associated with it.
The microarray technique generates a gene expression profile which gives the
expression states of genes in a cell by reporting the mRNA concentration. The mRNA
concentration in turn reports the cell status determined by what and how many proteins
are currently produced. The DNA microarray technologies such as cDNA and
oligonucleotide arrays provide means of measuring tens of thousands of genes
simultaneously (a snapshot of the cell). Microarrays are a large-scale, high-throughput
method for molecular biological experimentation.
Genes that share expression patterns, and hence might be regulated together, are assumed to
lie in the same genetic pathway. The microarray technique therefore helps in understanding
the dynamics and regulatory behavior of a cell.
One of the goals of microarray technology is the detection of genes that are differentially
expressed in tissue samples like healthy and cancerous tissues to see which genes are
relevant for cancer. It has important applications in pharmaceutical and clinical research and
helps in understanding gene regulation and interactions. Genome analysis also includes
genome anatomy and genome individuality (e.g. repetitions or single nucleotide
polymorphisms). We will also address current genomic research questions about alternative
splicing and nucleosome positioning.
Genomics – Main Research Areas
With the completion of the entire human DNA sequence, an important stage of genomic
research has come to an end. However, the scientists still have plenty of work in genomics
such as to determine the role of genomic variations on cell function and establish each gene
in the context of the entire genome. In addition to human genome, other species’ genomes are
being studied as well in order to better understand biology as well as to be able to use the
knowledge of genomics to improve human health and to treat diseases.
The advances in genomics were accompanied by advances in technology, which enable
scientists to conduct more complex research but have also led to specialization within the
field of genomics. The study of genomes has thus been divided into several research areas:
Human genomics
As already mentioned, the main goal of genomics in the field of the human genome, to
sequence the entire human DNA, was completed in 2007, when the sequence was declared
finished by the Human Genome Project. Now, however, the
scientists have to interpret the data and make the knowledge useful for practical applications
such as treating and preventing diseases.
Bacteriophage genomics
This field of genomic research is focused on the study of bacteriophages, the viruses which
infect bacteria. In the past, bacteriophages were used to define gene structure, and it was a
bacteriophage whose genome was sequenced first.
Cyanobacteria genomics
This refers to the study of cyanobacteria, also known as blue-green algae, which get their
energy through photosynthesis. This phylum of bacteria is thought to have played an
important role in shaping the Earth's atmosphere and the biodiversity of life on our planet by
producing oxygen through photosynthesis.
Metagenomics
This is the study of metagenomes, genetic material obtained directly from environmental
samples. Genome sequencing of such samples has shown that traditional research, based on
cultivated clonal cultures, missed most of microbial diversity. Metagenomics has revealed
many previously
unknown characteristics of the microbial world. As a result, this field of genomic research
has a great potential to revolutionize not only the understanding of the microbial world but
the entire living world.
Functional genomics
This field of genomics is focused on interpreting the data created by genomic research
projects in order to describe the functions and interactions of genes. In contrast to genomics,
which is mainly focused on obtaining information from the DNA sequence itself, functional
genomics is also interested in the DNA at the level of the genes. However, it does not use a
gene-by-gene approach but rather a genome-wide method.
Pharmacogenomics
This is a branch of pharmacology and genomics which researches the relationship between
drug response and genetic variation, in order to develop drug therapies that ensure optimal
efficacy and minimum risk of side effects with respect to the patient's genotype. This
approach has been shown to be very helpful in treating conditions such as cancer,
cardiovascular disease, diabetes, asthma and depression.
Drug design, sometimes referred to as rational drug design or more simply rational design, is
the inventive process of finding new medications based on the knowledge of a biological
target. The drug is most commonly an organic small molecule that activates or inhibits the
function of a biomolecule such as a protein, which in turn results in a therapeutic benefit to
the patient. In the most basic sense, drug design involves the design of small molecules that
are complementary in shape and charge to the biomolecular target with which they interact
and therefore will bind to it. Drug design frequently but not necessarily relies on computer
modeling techniques. This type of modeling is often referred to as computer-aided drug
design (CADD).
Drug discovery and development is an intense, lengthy and interdisciplinary endeavour. Drug
discovery is mostly portrayed as a linear, consecutive process that starts with target and lead
discovery, followed by lead optimization and pre-clinical in-vitro and in-vivo studies to
determine if such compounds satisfy a number of pre-set criteria for initiating clinical
trials.
Typically a drug target is a key molecule involved in a particular metabolic or signalling
pathway that is specific to a disease condition or pathology, or to the infectivity or survival of
a microbial pathogen. Some approaches attempt to stop the functioning of a pathway in the
diseased state by causing a key molecule to stop functioning. Drugs may be designed to
bind to the active region and inhibit this key molecule. However, such drugs also have to
be designed so as not to affect any other important molecules that may be similar in
appearance to the key molecule. Sequence homologies are often used to identify such risks.
Other approaches may be to enhance the normal pathway by promoting specific molecules in
the normal pathways that may have been affected in the diseased state. For the
pharmaceutical industry, bringing a drug from discovery to market takes approximately
12-14 years and costs up to $1.2-1.4 billion. Traditionally, drugs were discovered by
synthesizing compounds in a time-consuming, multi-step process, testing them against a
battery of in vivo biological screens and further investigating the promising candidates for
their pharmacokinetic properties, metabolism and potential toxicity. Such a development
process has resulted in high attrition rates, with failures attributed to poor pharmacokinetics
(39%), lack of efficacy (30%), animal toxicity (11%), adverse effects in humans (10%) and
various commercial and miscellaneous factors. Today, the process of drug discovery has
been revolutionised by the advent of genomics, proteomics, bioinformatics and efficient
technologies such as combinatorial chemistry, high-throughput screening (HTS), virtual
screening, de-novo design, in-vitro and in-silico ADMET screening, and structure-based
drug design.
In-silico Drug Designing
In-silico methods can help in identifying drug targets via bioinformatics tools.
They can also be used to analyze target structures for possible binding/active sites,
generate candidate molecules, check for their drug-likeness, dock these molecules with the
target, rank them according to their binding affinities, and further optimize the molecules to
improve binding characteristics.
The use of computers and computational methods permeates all aspects of drug discovery
today and forms the core of structure-based drug design. High-performance computing, data-
management software and the Internet facilitate access to the huge amounts of data
generated and transform massive, complex biological data into workable knowledge in the
modern-day drug discovery process. The use of complementary experimental and informatics
techniques increases the chance of success in many stages of the discovery process, from the
identification of novel targets and elucidation of their functions to the discovery and
development of lead compounds with desired properties. Computational tools offer the
advantage of delivering new drug candidates more quickly and at a lower cost. The major roles of
computation in drug discovery are:
(1) Virtual screening & de novo design,
(2) In silico ADME/T prediction and
(3) Advanced methods for determining protein-ligand binding.
Significance of in-silico drug design
As structures of more and more protein targets become available through crystallography,
NMR and bioinformatics methods, there is an increasing demand for computational tools that
can identify and analyze active sites and suggest potential drug molecules that can bind to
these sites specifically. Also to combat life-threatening diseases such as AIDS, Tuberculosis,
Malaria etc., a global push is essential. "Millions for Viagra and pennies for the diseases of the
poor" is the current situation of investment in pharma R&D. The time and cost required for
designing a new drug are immense and at an unacceptable level. According to some estimates,
it costs about $880 million and 14 years of research to develop a new drug before it is
introduced in the market. Intervention of computers at plausible steps is imperative to
bring down the cost and time required in the drug discovery process.
Structure based drug design
The crystal structure of a ligand bound to a protein provides a detailed insight into the
interactions made between the protein and the ligand. This structural information can be used to
identify where the ligand can be changed to modulate the physicochemical and ADME
properties of the compound, by showing which parts of the compound are important for
affinity and which parts can be altered without affecting the binding. The equilibrium
between target and ligand is governed by the free energy of the complex compared with the
free energies of the individual target and ligand. This includes not only the interaction between
target and ligand but also the solvation and entropy of the three different species and the
conformational energies of the free species.
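The free-energy balance described in this paragraph can be written out explicitly. This is a standard thermodynamic sketch (symbols as conventionally defined), not a formula taken from any specific method in this report:

```latex
\Delta G_{\text{bind}} = G_{\text{complex}} - \left( G_{\text{target}} + G_{\text{ligand}} \right)
                       = \Delta H_{\text{bind}} - T\,\Delta S_{\text{bind}},
\qquad
K_d = \exp\!\left( \frac{\Delta G_{\text{bind}}}{RT} \right)
```

A more negative ΔG of binding corresponds to a smaller dissociation constant K_d, i.e. tighter binding; the solvation and conformational terms mentioned above all enter through the enthalpy and entropy contributions.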
What is quality control and what are the methods applied for it?
Quality control is a process that is used to ensure a certain level of quality in a product or service. It might
include whatever actions a business deems necessary to provide for the control and verification of certain
characteristics of a product or service. Most often, it involves thoroughly examining and testing the quality
of products or the results of services. The basic goal of this process is to ensure that the products or
services that are provided meet specific requirements and characteristics, such as being dependable,
satisfactory, safe and fiscally sound.
Companies that engage in quality control typically have a team of workers who focus on testing a certain
number of products or observing services being done. The products or services that are examined usually
are chosen at random. The goal of the quality control team is to identify products or services that do not
meet a company's specified standards of quality. If a problem is identified, the job of a quality control team
or professional might involve stopping production or service until the problem has been corrected.
Depending on the particular service or product as well as the type of problem identified, production or
services might not cease entirely.
In the pharmaceutical industry, drug dissolution testing is routinely used to provide critical in vitro drug release
information for both quality control purposes, i.e., to assess batch-to-batch consistency of solid oral dosage forms such
as tablets, and drug development, i.e., to predict in vivo drug release profiles.
In vitro drug dissolution data generated from dissolution testing experiments can be related to in
vivo pharmacokinetic data by means of in vitro-in vivo correlations (IVIVC). A well established predictive IVIVC
model can be very helpful for drug formulation design and post-approval manufacturing changes.
The main objective of developing and evaluating an IVIVC is to establish the dissolution test as a surrogate for
human bioequivalence studies, as stated by the Food and Drug Administration (FDA). Analytical data from drug
dissolution testing are sufficient in many cases to establish safety and efficacy of a drug product without in vivo tests,
following minor formulation and manufacturing changes (Qureshi and Shabnam, 2001). Thus, the dissolution testing
which is conducted in dissolution apparatus must be able to provide accurate and reproducible results.
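A Level A IVIVC of the kind described above amounts to a point-to-point correlation between the fraction of drug dissolved in vitro and the fraction absorbed in vivo. The sketch below fits such a line with ordinary least squares; all data values are invented for illustration only:

```python
# Minimal sketch of a Level A IVIVC: a point-to-point linear fit between
# percent dissolved in vitro and percent absorbed in vivo.
# The time-matched profiles below are hypothetical example data.

def linear_fit(x, y):
    """Ordinary least-squares fit y = a*x + b (no external libraries)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    a = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    b = my - a * mx
    return a, b

dissolved = [10, 30, 55, 75, 90, 98]  # % dissolved at sample times
absorbed  = [8, 28, 52, 73, 88, 96]   # % absorbed at the same times

slope, intercept = linear_fit(dissolved, absorbed)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}")
```

A slope near 1 with a small intercept indicates that the dissolution test tracks in vivo absorption closely, which is the property the FDA surrogate argument relies on.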
Several dissolution apparatuses exist. In United States Pharmacopeia (USP) General Chapter <711> Dissolution, there
are four dissolution apparatuses standardized and specified.
• USP Dissolution Apparatus 1 - Basket (37°C)
• USP Dissolution Apparatus 2 - Paddle (37°C)
• USP Dissolution Apparatus 3 - Reciprocating Cylinder (37°C)
• USP Dissolution Apparatus 4 - Flow-Through Cell (37°C)
USP Dissolution Apparatus 2 is the most widely used apparatus among these four.
The performances of dissolution apparatuses are highly dependent on hydrodynamics due to the nature of dissolution
testing. The designs of the dissolution apparatuses and the ways of operating dissolution apparatuses have huge
impacts on the hydrodynamics, thus the performances. Hydrodynamic studies in dissolution apparatuses were carried
out by researchers over the past few years with both experimental methods and numerical modeling such
as Computational Fluid Dynamics (CFD). The main target was USP Dissolution Apparatus 2. The reason is that many
researchers suspect that USP Dissolution Apparatus 2 provides inconsistent and sometimes faulty data. The
hydrodynamic studies of USP Dissolution Apparatus 2 mentioned above clearly showed that it does have intrinsic
hydrodynamic issues which could result in problems. In 2005, Professor Piero Armenante from the New Jersey Institute
of Technology (NJIT) and Professor Fernando Muzzio submitted a technical report to the FDA. In
this technical report, the intrinsic hydrodynamic issues with USP Dissolution Apparatus 2, based on the research
findings of Armenante's group and Muzzio's group, were discussed.
More recently, hydrodynamic studies were conducted in USP Dissolution Apparatus 4.
Disintegration tests are performed as per the pharmacopoeial standards. Disintegration is a measure of the
quality of oral dosage forms such as tablets and capsules. Each pharmacopoeia (e.g. the USP, BP and IP)
has its own set of standards and specifies disintegration tests of its own. The USP, European
Pharmacopoeia and Japanese Pharmacopoeia have been harmonised by the International Conference on
Harmonisation (ICH) and are interchangeable. The disintegration test is performed to find out the time it takes
for a solid oral dosage form like a tablet or capsule to completely disintegrate. The time of disintegration is a
measure of the quality. This is because, for example, if the disintegration time is too high, it may mean that the
tablet is too highly compressed or that the capsule-shell gelatin is not of pharmacopoeial quality, or it may imply
several other problems. Also, if the disintegration time is not uniform in a set of samples being analysed, it
indicates batch inconsistency and lack of batch uniformity.
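Batch uniformity of the kind mentioned above is often summarised by the relative standard deviation (RSD) of the measured disintegration times. A minimal sketch, with hypothetical times and an assumed in-house RSD limit:

```python
# Sketch: flag batch inconsistency from the spread of disintegration times.
# The six times (minutes) and the RSD limit are hypothetical examples.

def mean(xs):
    return sum(xs) / len(xs)

def rsd_percent(xs):
    """Relative standard deviation: sample SD as a percentage of the mean."""
    m = mean(xs)
    var = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return (var ** 0.5) / m * 100

times = [7.2, 7.8, 6.9, 7.5, 7.1, 7.4]  # one tablet per tube
print(f"mean = {mean(times):.2f} min, RSD = {rsd_percent(times):.1f}%")
uniform = rsd_percent(times) < 10  # assumed in-house acceptance limit
```

A low RSD across the six tubes suggests consistent compression; a high RSD would point to the batch-uniformity problems described above.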
The Swiss Pharmacopoeia, way back in 1935, required that a disintegration test should be performed on all
tablets and capsules as a criterion of its performance (1). Disintegration test was seen as a test for the uniformity
of the compressional characteristics. Optimisation of compression characteristics was done based on
disintegration test and the hardness test. The modern medicine era may be considered to have started in 1937, and
from that year tablets became important (2). Tabletting technology was mostly empirical up to 1950. Until
then, formulators depended largely on the disintegration test to optimise their compression
characteristics. Drug-release testing by way of dissolution testing was not much used to characterise tablets,
probably because convenient and sensitive chemical analyses were not available before that
period. The British Pharmacopoeia was the first, in 1945, to adopt an official disintegration test. Before 1950,
the test became official in USP also. Even at that time, it was recognised that disintegration does not ensure
good performance. USP-NF of that period says " disintegration does not imply complete solution of the tablet or
even of its active ingredient." In the year 1950, sporadic reports of tablet products of vitamins failing to release
their total drug content started appearing. It was only then that formulators realised that though the
tablets/capsules showed the required disintegration time, they might show poor dissolution, which might affect their
clinical performance. Chapman et al. demonstrated that formulations with long disintegration times might not
show good bioavailability. Later, John Wagner demonstrated the relationship between poor performance of
some drug products in disintegration tests and their failure to release the drug during their gastrointestinal
transit. In the 1960s two separate developments occurred. One is the development of sensitive instrumental
methods of analysis and the other is the growth of a new generation of pharmaceutical scientists who started
applying the principles of physical chemistry to pharmacy. This development was attributed in USA to Takeru
Higuchi and his students. In the later period more pharmaceutical scientists like, Campagna, Nelson, and Levy
worked on this field and more and more instances of lack of correlation between disintegration time and
bioavailability surfaced. It was in the year 1970 that the first dissolution apparatus, the rotating basket, was
designed and adopted in the USA. An excellent review of the disintegration test was written by Wagner in 1971 (3).
Disintegration Test Apparatus
Coming to the test, the disintegration test is conducted using the disintegration apparatus. Although there are
slight variations in the different pharmacopoeias, the basic construction and the working of the apparatus
remains the same. The apparatus consists of a basket made of transparent polyvinyl or other plastic material. Six
tubes of equal diameter are set into the basket, and a stainless-steel wire mesh of uniform
mesh size is fixed to the bottom of each tube. Small metal discs may be used to ensure complete immersion of the
dosage form. The basket-rack assembly is moved by a reciprocating motor fixed to the apex
of the basket-rack assembly. The entire assembly is immersed in a vessel containing the medium in which the
disintegration test is to be carried out. The vessel is provided with a thermostat to regulate the temperature of the
fluid medium to the desired temperature.
Disintegration Test Method
The disintegration test for each dosage form is given in the pharmacopoeia. There are some general tests for
typical types of dosage forms. However, the disintegration test prescribed in the individual monograph of a
product is to be followed. If the monograph does not specify any specific test, the general test for the specific
dosage form may be employed. Some types of dosage forms and their disintegration tests are:
1. Uncoated tablets - tested using distilled water as the medium at 37 ± 2 °C and 29-32 cycles per minute; the test is complete after 15 minutes. The result is acceptable when there is no palpable core at the end of the cycle (for at least 5 tablets or capsules) and the mass does not stick to the immersion disc.
2. Coated tablets - the same procedure is used, but the time of operation is 30 minutes.
3. Enteric-coated/gastro-resistant tablets - the test is carried out first in distilled water (at room temperature for 5 min; USP — no distilled water step per BP and IP), then in 0.1 M HCl (up to 2 hours; BP) or simulated gastric fluid (1 hour; USP), followed by phosphate buffer, pH 6.8 (1 hour; BP) or simulated intestinal fluid without enzymes (1 hour; USP).
4. Chewable tablets - exempted from the disintegration test (BP and IP); 4 hours (USP).
These are a few examples for illustration. The disintegration tests for capsules, both hard and soft gelatin, are performed in a similar manner. The USP also provides disintegration tests for suppositories, pessaries, etc.
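The general acceptance criterion for uncoated tablets described above (every unit disintegrating completely within the 15-minute limit) can be sketched as a simple check; the recorded times below are hypothetical:

```python
# Sketch of the uncoated-tablet acceptance check: all six units must
# disintegrate (no palpable core) within the 15-minute limit.
# The recorded times are invented example data.

LIMIT_MIN = 15  # limit for uncoated tablets per the general test

def passes_disintegration(times_min, limit=LIMIT_MIN):
    """True if every unit disintegrated completely within the limit."""
    return all(t <= limit for t in times_min)

batch_a = [6.5, 8.0, 7.2, 9.1, 6.8, 7.7]   # all six within the limit
batch_b = [6.5, 8.0, 16.3, 9.1, 6.8, 7.7]  # one unit exceeds the limit
print(passes_disintegration(batch_a), passes_disintegration(batch_b))
```

For coated or enteric-coated tablets the same check applies with the different limits and media listed above.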
Applications of Disintegration test :
1. The disintegration test is a simple test which helps the formulator at the preformulation stage.
2. It helps in the optimisation of manufacturing variables, such as compressional force and dwell time.
3. It is a simple in-process control tool to ensure uniformity from batch to batch and among different tablets.
4. It is an important test in the quality control of tablets and hard gelatin capsules.
Advantages of Disintegration tests:
This test is simple in concept and in practice. It is very useful in preformulation, optimisation and in quality control.
Limitations of Disintegration tests:
The disintegration test cannot be relied upon for assurance of bioavailability.
HIGH PERFORMANCE LIQUID CHROMATOGRAPHY - HPLC
High performance liquid chromatography is a powerful tool in analysis. This page
looks at how it is carried out and shows how it uses the same principles as in thin
layer chromatography and column chromatography.
Carrying out HPLC
High performance liquid chromatography is basically a highly improved form of
column chromatography. Instead of a solvent being allowed to drip through a
column under gravity, it is forced through under high pressures of up to 400
atmospheres. That makes it much faster.
It also allows you to use a very much smaller particle size for the column packing
material which gives a much greater surface area for interactions between the
stationary phase and the molecules flowing past it. This allows a much better
separation of the components of the mixture. The other major improvement over
column chromatography concerns the detection methods which can be used.
These methods are highly automated and extremely sensitive.
The column and the solvent
Confusingly, there are two variants in use in HPLC depending on the relative
polarity of the solvent and the stationary phase.
Normal phase HPLC
This is essentially just the same as you will already have read about in thin layer
chromatography or column chromatography. Although it is described as "normal",
it isn't the most commonly used form of HPLC.
The column is filled with tiny silica particles, and the solvent is non-polar -
hexane, for example. A typical column has an internal diameter of 4.6 mm (and
may be less than that), and a length of 150 to 250 mm.
Polar compounds in the mixture being passed through the column will stick
longer to the polar silica than non-polar compounds will. The non-polar ones will
therefore pass more quickly through the column.
Reversed phase HPLC
In this case, the column size is the same, but the silica is modified to make it non-
polar by attaching long hydrocarbon chains to its surface - typically with either 8
or 18 carbon atoms in them. A polar solvent is used - for example, a mixture of
water and an alcohol such as methanol.
In this case, there will be a strong attraction between the polar solvent and polar
molecules in the mixture being passed through the column. There won't be as
much attraction between the hydrocarbon chains attached to the silica (the
stationary phase) and the polar molecules in the solution. Polar molecules in the
mixture will therefore spend most of their time moving with the solvent.
Non-polar compounds in the mixture will tend to form attractions with the
hydrocarbon groups because of van der Waals dispersion forces. They will also be
less soluble in the solvent because of the need to break hydrogen bonds as they
squeeze in between the water or methanol molecules, for example. They therefore
spend less time in solution in the solvent and this will slow them down on their
way through the column.
That means that now it is the polar molecules that will travel through the column more quickly.
Reversed phase HPLC is the most commonly used form of HPLC.
Looking at the whole process
A flow scheme for HPLC
Injection of the sample
Injection of the sample is entirely automated, and you wouldn't be expected to
know how this is done at this introductory level. Because of the pressures
involved, it is not the same as in gas chromatography (if you have already studied that).
Retention time: The time taken for a particular compound to travel through the
column to the detector is known as its retention time. This time is measured from
the time at which the sample is injected to the point at which the display shows a
maximum peak height for that compound.
Different compounds have different retention times. For a particular compound,
the retention time will vary depending on:
• the pressure used (because that affects the flow rate of the solvent)
• the nature of the stationary phase (not only what material it is made of, but also particle size)
• the exact composition of the solvent
• the temperature of the column
That means that conditions have to be carefully controlled if you are using
retention times as a way of identifying compounds.
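Identification by retention time, as described above, amounts to comparing an observed time against reference values measured for pure standards under identical conditions. A minimal sketch; the compound names, reference times and tolerance window are all hypothetical:

```python
# Sketch: identify a peak by comparing its retention time with reference
# values measured for pure standards under identical column conditions.
# Reference times (minutes) and the tolerance are invented examples.

REFERENCE_RT = {"caffeine": 3.52, "paracetamol": 2.10, "ibuprofen": 7.84}

def identify(rt, references=REFERENCE_RT, tol=0.05):
    """Return names whose reference retention time lies within +/- tol min."""
    return [name for name, ref in references.items() if abs(rt - ref) <= tol]

print(identify(3.50))  # close to the caffeine reference
print(identify(5.00))  # matches nothing in the reference table
```

The tolerance must be chosen to reflect how tightly pressure, stationary phase, solvent composition and temperature are controlled from run to run.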
There are several ways of detecting when a substance has passed through the
column. A common method which is easy to explain uses ultra-violet absorption.
Many organic compounds absorb UV light of various wavelengths. If you have a
beam of UV light shining through the stream of liquid coming out of the column,
and a UV detector on the opposite side of the stream, you can get a direct reading
of how much of the light is absorbed.
The amount of light absorbed will depend on the amount of a particular
compound that is passing through the beam at the time.
You might wonder why the solvents used don't absorb UV light. They do! But
different compounds absorb most strongly in different parts of the UV spectrum.
Methanol, for example, absorbs at wavelengths below 205 nm, and water below
190 nm. If you were using a methanol-water mixture as the solvent, you would
therefore have to use a wavelength greater than 205 nm to avoid false readings
from the solvent.
Interpreting the output from the detector
The output will be recorded as a series of peaks - each one representing a
compound in the mixture passing through the detector and absorbing UV light. As
long as you were careful to control the conditions on the column, you could use
the retention times to help to identify the compounds present - provided, of
course, that you (or somebody else) had already measured them for pure samples
of the various compounds under those identical conditions.
But you can also use the peaks as a way of measuring the quantities of the
compounds present. Let's suppose that you are interested in a particular compound, X.
If you injected a solution containing a known amount of pure X into the machine,
not only could you record its retention time, but you could also relate the amount
of X to the peak that was formed.
The area under the peak is proportional to the amount of X which has passed the
detector, and this area can be calculated automatically by the computer linked to
the display. The area it would measure is shown in green in the (very simplified) diagram.
If the solution of X was less concentrated, the area under the peak would be less,
although the retention time would still be the same.
This means that it is possible to calibrate the machine so that it can be used to find
how much of a substance is present - even in very small quantities.
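The calibration idea described above can be sketched as a single-point, through-origin calibration: a pure standard of known amount fixes a response factor, which then converts an unknown peak area into an amount. All numbers below are invented for illustration:

```python
# Sketch: single-point peak-area calibration for compound "X".
# The standard amount, its peak area and the unknown area are hypothetical.

def calibrate(known_amount, known_area):
    """Response factor: amount of analyte per unit of peak area."""
    return known_amount / known_area

def quantify(area, factor):
    """Convert a measured peak area into an amount of analyte."""
    return area * factor

factor = calibrate(known_amount=50.0, known_area=125000.0)  # e.g. 50 ug standard
amount = quantify(62500.0, factor)  # unknown peak with half the standard's area
print(f"{amount:.1f} ug")
```

In practice a multi-point calibration curve over the expected concentration range is preferred, but the proportionality between peak area and amount is the same.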
Be careful, though! If you had two different substances in the mixture (X and Y)
could you say anything about their relative amounts? Not if you were using UV
absorption as your detection method.
In the diagram, the area under the peak for Y is less than that for X. That may be
because there is less Y than X, but it could equally well be because Y absorbs UV
light at the wavelength you are using less strongly than X does. There might be large
quantities of Y present, but if it only absorbed weakly, it would only give a small peak.
Coupling HPLC to a mass spectrometer
This is where it gets really clever! When the detector is showing a peak, some of
what is passing through the detector at that time can be diverted to a mass
spectrometer. There it will give a fragmentation pattern which can be compared
against a computer database of known patterns. That means that the identity of a
huge range of compounds can be found without having to know their retention times in advance.
Introduction to Spectroscopy
In previous sections of this text the structural formulas of hundreds of organic compounds have
been reported, often with very little supporting evidence. These structures, and millions of others
described in the scientific literature, are in fact based upon sound experimental evidence, which
was omitted at the time in order to focus on other aspects of the subject. Much of the most
compelling evidence for structure comes from spectroscopic experiments, as will be
demonstrated in the following topics.
The Light of Knowledge is an often used phrase, but it is particularly appropriate in reference to
spectroscopy. Most of what we know about the structure of atoms and molecules comes from
studying their interaction with light (electromagnetic radiation). Different regions of
the electromagnetic spectrum provide different kinds of information as a result of such
interactions. Realizing that light may be considered to have both wave-like and particle-like
characteristics, it is useful to consider that a given frequency or wavelength of light is associated
with a "light quantum" of energy we now call a photon. As noted in the following equations,
frequency and energy change proportionally, but wavelength has an inverse relationship to these quantities.
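The proportionalities referred to above are the standard relations (c is the speed of light, h is Planck's constant):

```latex
\nu = \frac{c}{\lambda}, \qquad E = h\nu = \frac{hc}{\lambda}
```

Frequency and energy rise together, while both are inversely proportional to wavelength.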
In order to "see" a molecule, we must use light having a wavelength smaller than the molecule
itself (roughly 1 to 15 angstrom units). Such radiation is found in the X-ray region of the
spectrum, and the field of X-ray crystallography yields remarkably detailed pictures of molecular
structures amenable to examination. The chief limiting factor here is the need for high quality
crystals of the compound being studied. The methods of X-ray crystallography are too complex to
be described here; nevertheless, as automatic instrumentation and data handling techniques
improve, it will undoubtedly prove to be the procedure of choice for structure determination.
The spectroscopic techniques described below do not provide a three-dimensional picture of a
molecule, but instead yield information about certain characteristic features. A brief summary of
this information follows:
• Mass Spectrometry: Sample molecules are ionized by high energy electrons. The mass to
charge ratio of these ions is measured very accurately by electrostatic acceleration and
magnetic field perturbation, providing a precise molecular weight. Ion fragmentation patterns
may be related to the structure of the molecular ion.
• Ultraviolet-Visible Spectroscopy: Absorption of this relatively high-energy light causes
electronic excitation. The easily accessible part of this region (wavelengths of 200 to 800 nm)
shows absorption only if conjugated pi-electron systems are present.
• Infrared Spectroscopy: Absorption of this lower-energy radiation causes vibrational and
rotational excitation of groups of atoms within the molecule. Because of their characteristic
absorptions, identification of functional groups is easily accomplished.
• Nuclear Magnetic Resonance Spectroscopy: Absorption in the low-energy radio-
frequency part of the spectrum causes excitation of nuclear spin states. NMR spectrometers
are tuned to certain nuclei (e.g. ¹H, ¹³C, ¹⁹F & ³¹P). For a given type of nucleus, high-resolution
spectroscopy distinguishes and counts atoms in different locations in the molecule.
Visible and Ultraviolet Spectroscopy
An obvious difference between certain
compounds is their color. Thus,
quinone is yellow; chlorophyll is green;
derivatives of aldehydes and ketones
range in color from bright yellow to deep red, depending on double bond conjugation; and
aspirin is colorless. In this respect the human eye is functioning as a spectrometer analyzing
the light reflected from the surface of a solid or passing through a liquid. Although we see
sunlight (or white light) as uniform or homogeneous in color, it is actually composed of a
broad range of radiation wavelengths in the ultraviolet (UV), visible and infrared (IR)
portions of the spectrum. As shown on the right, the component colors of the visible portion
can be separated by passing sunlight through a prism, which acts to bend the light in differing
degrees according to wavelength. Electromagnetic radiation such as visible light is commonly
treated as a wave phenomenon, characterized by a wavelength or frequency. Wavelength is
defined on the left below, as the distance between adjacent peaks (or troughs), and may be
designated in meters, centimeters or nanometers (10⁻⁹ meters). Frequency is the number of
wave cycles that travel past a fixed point per unit of time, and is usually given in cycles per
second, or hertz (Hz). Visible wavelengths cover a range from approximately 400 to 800 nm.
The longest visible wavelength is red and the shortest is violet. Other common colors of the
spectrum, in order of decreasing wavelength, may be remembered by the mnemonic: ROY G
BIV. The wavelengths of what we perceive as particular colors in the visible portion of the
spectrum are displayed and listed below. In horizontal diagrams, such as the one on the
bottom left, wavelength will increase on moving from left to right.
Violet: 400 - 420 nm
Indigo: 420 - 440 nm
Blue: 440 - 490 nm
Green: 490 - 570 nm
Yellow: 570 - 585 nm
Orange: 585 - 620 nm
Red: 620 - 780 nm
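The wavelength ranges tabulated above can be turned into a simple lookup; the sketch below encodes exactly those boundaries (in nm):

```python
# Sketch: look up the color name for a visible wavelength using the
# ranges tabulated in the text (lower bound inclusive, upper exclusive).

BANDS = [
    (400, 420, "violet"),
    (420, 440, "indigo"),
    (440, 490, "blue"),
    (490, 570, "green"),
    (570, 585, "yellow"),
    (585, 620, "orange"),
    (620, 780, "red"),
]

def color_of(wavelength_nm):
    for lo, hi, name in BANDS:
        if lo <= wavelength_nm < hi:
            return name
    return None  # outside the visible range as tabulated

print(color_of(550), color_of(650))
```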
When white light passes through or is reflected by a colored substance, a characteristic
portion of the mixed wavelengths is absorbed. The
remaining light will then assume the complementary
color to the wavelength(s) absorbed. This relationship
is demonstrated by the color wheel shown on the
right. Here, complementary colors are diametrically
opposite each other. Thus, absorption of 420-430 nm
light renders a substance yellow, and absorption of
500-520 nm light makes it red. Green is unique in
that it can be created by absorption close to 400 nm as
well as absorption near 800 nm.
Early humans valued colored pigments, and used
them for decorative purposes. Many of these were inorganic minerals, but several important
organic dyes were also known. These included the crimson pigment, kermesic acid, the blue
dye, indigo, and the yellow saffron pigment, crocetin. A rare dibromo-indigo derivative,
punicin, was used to color the robes of the royal and wealthy. The deep orange hydrocarbon
carotene is widely distributed in plants, but is not sufficiently stable to be used as permanent
pigment, other than for food coloring. A common feature of all these colored compounds,
displayed below, is a system of extensively conjugated pi-electrons.
2. The Electromagnetic Spectrum
The visible spectrum constitutes but a small part of the total radiation spectrum. Most of the
radiation that surrounds us cannot be seen, but can be detected by dedicated sensing
instruments. This electromagnetic spectrum ranges from very short wavelengths (including
gamma and x-rays) to very long wavelengths (including microwaves and broadcast radio
waves). The following chart displays many of the important regions of this spectrum, and
demonstrates the inverse relationship between wavelength and frequency (shown in the top
equation below the chart).
The energy associated with a given segment of the spectrum is proportional to its frequency.
The bottom equation describes this relationship, which provides the energy carried by a
photon of a given wavelength of radiation.
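The energy-per-photon relationship just described can be checked numerically. The sketch below computes the molar photon energy E = N_A·h·c/λ and converts it to kcal/mol, reproducing the figures quoted later in the text (roughly 36 kcal/mol at 800 nm, 72 at 400 nm and 143 at 200 nm); the constants are standard CODATA values:

```python
# Sketch: molar photon energy from wavelength, E = N_A * h * c / lambda,
# converted from joules per mole to kcal per mole.

H = 6.62607015e-34   # Planck constant, J*s
C = 2.99792458e8     # speed of light, m/s
NA = 6.02214076e23   # Avogadro constant, 1/mol
J_PER_KCAL = 4184.0

def photon_energy_kcal_per_mol(wavelength_nm):
    wavelength_m = wavelength_nm * 1e-9
    return NA * H * C / wavelength_m / J_PER_KCAL

for wl in (800, 400, 200):
    print(f"{wl} nm -> {photon_energy_kcal_per_mol(wl):.1f} kcal/mol")
```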
3. UV-Visible Absorption Spectra
To understand why some compounds are colored and others are not, and to determine the
relationship of conjugation to color, we must make accurate measurements of light absorption
at different wavelengths in and near the visible part of the spectrum. Commercial optical
spectrometers enable such experiments to be conducted with ease, and usually survey both
the near ultraviolet and visible portions of the spectrum.
The visible region of the spectrum comprises photon energies of 36 to 72 kcal/mole, and the
near ultraviolet region, out to 200 nm, extends this energy range to 143 kcal/mole. Ultraviolet
radiation having wavelengths less than 200 nm is difficult to handle, and is seldom used as a
routine tool for structural analysis.
The energies noted above are sufficient to promote or excite a molecular electron to a
higher energy orbital. Consequently, absorption spectroscopy carried out in this region is
sometimes called "electronic spectroscopy". A diagram showing the
various kinds of electronic excitation that may occur in organic molecules is shown on the
left. Of the six transitions outlined, only the two lowest energy ones (left-most, colored blue)
are achieved by the energies available in the 200 to 800 nm spectrum. As a rule, energetically
favored electron promotion will be from the highest occupied molecular orbital
(HOMO) to the lowest unoccupied molecular orbital (LUMO), and the resulting species is
called an excited state.
When sample molecules are exposed to light having an energy that matches a possible
electronic transition within the molecule, some of the light energy will be absorbed as the
electron is promoted to a higher energy orbital. An optical spectrometer records the
wavelengths at which absorption occurs, together with the degree of absorption at each
wavelength. The resulting spectrum is presented as a graph of absorbance (A) versus
wavelength, as in the isoprene spectrum shown below. Since isoprene is colorless, it does not
absorb in the visible part of the spectrum and this region is not displayed on the
graph. Absorbance usually ranges from 0 (no absorption) to 2 (99% absorption), and is
precisely defined in context with spectrometer operation.
Because the absorbance of a sample will be proportional to the number of absorbing
molecules in the spectrometer light beam (e.g. their molar concentration in the sample tube),
it is necessary to correct the absorbance value for this and other operational factors if the
spectra of different compounds are to be compared in a meaningful way. The corrected
absorption value is called "molar absorptivity", and is particularly useful when comparing the
spectra of different compounds and determining the relative strength of light absorbing
functions (chromophores). Molar absorptivity (ε) is defined as:
ε = A / (c · l)
(where A = absorbance, c = sample concentration in moles/liter & l =
length of light path through the sample in cm)
If the isoprene spectrum described above was obtained from a dilute hexane solution (c = 4 × 10⁻⁵
moles per liter) in a 1 cm sample cuvette, a simple calculation using the above formula
indicates a molar absorptivity of 20,000 at the maximum absorption wavelength. Indeed, the
entire vertical absorbance scale may be changed to a molar absorptivity scale once this
information about the sample is in hand.
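The calculation just described can be sketched in Python; the absorbance value (0.8) is an assumed figure, chosen so the numbers reproduce the ε = 20,000 result quoted above:

```python
# Molar absorptivity via the Beer-Lambert relation: epsilon = A / (c * l).
# The absorbance A = 0.8 is an assumption chosen to match the isoprene
# example in the text (c = 4e-5 mol/L, 1 cm cuvette).
def molar_absorptivity(absorbance, conc_mol_per_l, path_cm):
    """Return epsilon in L mol^-1 cm^-1."""
    return absorbance / (conc_mol_per_l * path_cm)

eps = molar_absorptivity(0.8, 4e-5, 1.0)
print(eps)  # 20000.0 at the maximum absorption wavelength
```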
Chromophore   Example        Excitation   λmax, nm   ε        Solvent
C=C           Ethene         π → π*       171        15,000   hexane
C≡C           1-Hexyne       π → π*       180        10,000   hexane
C=O           Ethanal        n → π*       290        15       hexane
N=O           Nitromethane   n → π*       275        17       ethanol
From the chart above it should be clear that the only molecular moieties likely to absorb light
in the 200 to 800 nm region are pi-electron functions and hetero atoms having non-bonding
valence-shell electron pairs. Such light absorbing groups are referred to as chromophores. A
list of some simple chromophores and their light absorption characteristics is provided in the
table above. The oxygen non-bonding electrons in alcohols and ethers do not give rise to
absorption above 160 nm. Consequently, pure alcohol and ether solvents may be used for
spectroscopic studies in this region.
spectroscopy, but the failure of most instruments to provide absorption data for wavelengths
below 200 nm makes the detection of isolated chromophores problematic. Fortunately,
conjugation generally moves the absorption maxima to longer wavelengths, as in the case of
isoprene, so conjugation becomes the major structural feature identified by this technique.
Molar absorptivities may be very large for strongly absorbing chromophores (>10,000) and
very small if absorption is weak (10 to 100). The magnitude of ε reflects both the size of the
chromophore and the probability that light of a given wavelength will be absorbed when it
strikes the chromophore.
4. The Importance of Conjugation
A comparison of the absorption spectrum of 1-pentene, λmax = 178 nm, with that of isoprene
(above) clearly demonstrates the importance of chromophore conjugation. Further evidence
of this effect is shown below. Conjugation of double and triple bonds also shifts the
absorption maximum to longer wavelengths. From polyene spectra it is clear that each
additional double bond in the conjugated pi-electron system shifts the absorption maximum
about 30 nm to longer wavelength. Also, the molar absorptivity (ε) roughly doubles with each
new conjugated double bond. Spectroscopists use the terms defined below when describing
shifts in absorption. Thus, extending conjugation generally results in bathochromic and
hyperchromic shifts in absorption.
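The two rules of thumb above (a roughly 30 nm bathochromic shift and a rough doubling of ε per added conjugated double bond) can be turned into a toy extrapolation. The butadiene starting values (λmax ≈ 217 nm, ε ≈ 21,000) are standard literature figures, and the whole model is illustrative only, not quantitative:

```python
# Illustrative polyene extrapolation, assuming the text's rules of thumb:
# each added conjugated double bond shifts lambda_max by ~30 nm and
# roughly doubles epsilon. Butadiene (2 double bonds) is the baseline.
def estimate_polyene(n_double_bonds, base_nm=217.0, base_eps=21000.0):
    extra = n_double_bonds - 2            # double bonds beyond butadiene
    lam = base_nm + 30.0 * extra          # bathochromic shift
    eps = base_eps * (2.0 ** extra)       # hyperchromic effect
    return lam, eps

for n in (2, 3, 4, 5):
    lam, eps = estimate_polyene(n)
    print(n, lam, int(eps))
```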
The appearance of several absorption peaks or shoulders for a given chromophore is common
for highly conjugated systems, and is often solvent dependent. This fine structure reflects not
only the different conformations such systems may assume, but also electronic transitions
between the different vibrational energy levels possible for each electronic state. Vibrational
fine structure of this kind is most pronounced in vapor phase spectra, and is increasingly
broadened and obscured in solution as the solvent is changed from hexane to methanol.
Terminology for Absorption Shifts
Bathochromic shift (red shift): absorption maximum moves to longer wavelength.
Hypsochromic shift (blue shift): absorption maximum moves to shorter wavelength.
Hyperchromic effect: increase in absorption intensity (ε).
Hypochromic effect: decrease in absorption intensity (ε).
To understand why conjugation should cause bathochromic shifts in the absorption maxima
of chromophores, we need to look at the relative energy levels of the pi-orbitals. When two
double bonds are conjugated, the four p-atomic orbitals combine to generate four pi-
molecular orbitals (two are bonding and two are antibonding). This was described earlier in
the section concerning diene chemistry. In a similar manner, the three double bonds of a
conjugated triene create six pi-molecular orbitals, half bonding and half antibonding. The
energetically most favorable π → π* excitation occurs from the highest energy bonding pi-
orbital (HOMO) to the lowest energy antibonding pi-orbital (LUMO).
The following diagram illustrates this excitation for an isolated double bond (only two pi-
orbitals) and for a conjugated diene and triene. In each case the HOMO is colored blue and
the LUMO is colored magenta. Increased conjugation brings the HOMO and LUMO orbitals
closer together. The energy (ΔE) required to effect the electron promotion is therefore less,
and the wavelength that provides this energy is increased correspondingly
(remember λ = h·c/ΔE).
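The relation λ = h·c/ΔE is easy to check numerically; the sketch below converts between a HOMO-LUMO gap and the absorbed wavelength, using ethene's 171 nm band from the chromophore table above:

```python
# Relation between excitation energy and absorbed wavelength:
# lambda = h*c / delta_E. Ethene's 171 nm pi -> pi* band is the check case.
H = 6.62607e-34   # Planck constant, J*s
C = 2.99792e8     # speed of light, m/s
AVOGADRO = 6.02214e23

def gap_kj_per_mol(lambda_nm):
    """HOMO-LUMO gap (kJ/mol) corresponding to an absorption wavelength."""
    return H * C / (lambda_nm * 1e-9) * AVOGADRO / 1e3

def wavelength_nm(delta_e_kj_per_mol):
    """Wavelength (nm) corresponding to a gap in kJ/mol."""
    e_per_molecule = delta_e_kj_per_mol * 1e3 / AVOGADRO
    return H * C / e_per_molecule * 1e9

print(round(gap_kj_per_mol(171.0)))   # ~700 kJ/mol for ethene's band
```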
Examples of π → π* Excitation
Many other kinds of conjugated pi-electron systems act as chromophores and absorb light in
the 200 to 800 nm region. These include unsaturated aldehydes and ketones and aromatic ring
compounds. A few examples are displayed below. The spectrum of the unsaturated ketone
illustrates the advantage of a logarithmic display of molar absorptivity. The π → π*
absorption located at 242 nm is very strong, with ε = 18,000. The weak n → π*
absorption near 300 nm has ε = 100.
Benzene exhibits very strong light absorption near 180 nm (ε > 65,000) , weaker absorption
at 200 nm (ε = 8,000) and a group of much weaker bands at 254 nm (ε = 240). Only the last
group of absorptions is completely displayed because of the 200 nm cut-off characteristic of
most spectrophotometers. The added conjugation in naphthalene, anthracene and tetracene
causes bathochromic shifts of these absorption bands, as displayed in the chart
below. Not all the absorptions shift by the same amount, so for anthracene (green shaded
box) and tetracene (blue shaded box) the weak absorption is obscured by stronger bands that
have experienced a greater red shift. As might be expected from their spectra, naphthalene
and anthracene are colorless, but tetracene is orange.
The spectrum of the bicyclic diene (above right) shows some vibrational fine structure, but in
general is similar in appearance to that of isoprene, shown above. Closer inspection discloses
that the absorption maximum of the more highly substituted diene has moved to a longer
wavelength by about 15 nm. This "substituent effect" is general for dienes and trienes, and is
even more pronounced for enone chromophores.
As noted in a previous chapter, the light our eyes see is but a small part of a broad spectrum
of electromagnetic radiation. On the immediate high energy side of the visible spectrum lies
the ultraviolet, and on the low energy side is the infrared. The portion of the infrared region
most useful for analysis of organic compounds is not immediately adjacent to the visible
spectrum, but is that having a wavelength range from 2,500 to 16,000 nm, with a
corresponding frequency range from 1.9 × 10¹³ to 1.2 × 10¹⁴ Hz.
Photon energies associated with this part of the infrared (from 1 to 15 kcal/mole) are not large
enough to excite electrons, but may induce vibrational excitation of covalently bonded atoms
and groups. The covalent bonds in molecules are not rigid sticks or rods, such as found in
molecular model kits, but are more like stiff springs that can be stretched and bent. The
mobile nature of organic molecules was noted in the chapter concerning conformational
isomers. We must now recognize that, in addition to the facile rotation of groups about single
bonds, molecules experience a wide variety of vibrational motions, characteristic of their
component atoms. Consequently, virtually all organic compounds will absorb infrared
radiation that corresponds in energy to these vibrations. Infrared spectrometers, similar in
principle to the UV-Visible spectrometer described elsewhere, permit chemists to obtain
absorption spectra of compounds that are a unique reflection of their molecular structure. An
example of such a spectrum is that of the flavoring agent vanillin, shown below.
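The wavelength, frequency and photon-energy figures quoted above are straightforward to verify:

```python
# Converts an IR wavelength to frequency and photon energy per mole,
# checking the ranges quoted in the text (2,500-16,000 nm; 1-15 kcal/mole).
H = 6.62607e-34       # Planck constant, J*s
C = 2.99792e8         # speed of light, m/s
AVOGADRO = 6.02214e23
J_PER_KCAL = 4184.0

def ir_frequency_hz(lambda_nm):
    return C / (lambda_nm * 1e-9)

def photon_energy_kcal_per_mol(lambda_nm):
    return H * ir_frequency_hz(lambda_nm) * AVOGADRO / J_PER_KCAL

print(ir_frequency_hz(16000))             # ~1.9e13 Hz (long-wavelength end)
print(photon_energy_kcal_per_mol(2500))   # ~11 kcal/mol (high-energy end)
```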
The complexity of this spectrum is typical of most infrared spectra, and illustrates their use in
identifying substances. The gap in the spectrum between 700 & 800 cm⁻¹ is due to solvent
(CCl4) absorption. Further analysis (below) will show that this spectrum also indicates the
presence of an aldehyde function, a phenolic hydroxyl and a substituted benzene ring. The
inverted display of absorption, compared with UV-Visible spectra, is characteristic. Thus a
sample that did not absorb at all would record a horizontal line at 100% transmittance (top of
the chart).
The frequency scale at the bottom of the chart is given in units of reciprocal centimeters
(cm⁻¹) rather than Hz, because the numbers are more manageable. The reciprocal centimeter
is the number of wave cycles in one centimeter; whereas, frequency in cycles per second or
Hz is equal to the number of wave cycles in 3 × 10¹⁰ cm (the distance covered by light in one
second). Wavelength units are in micrometers, or microns (μ), instead of nanometers for the
same reason. Most infrared spectra are displayed on a linear frequency scale, as shown here,
but in some older texts a linear wavelength scale is used.
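The unit interconversions just described can be sketched as:

```python
# Interconversion of IR units: wavenumber (cm^-1), frequency (Hz) and
# wavelength (microns), following the definitions in the text.
C_CM_PER_S = 2.99792e10   # speed of light, cm/s

def wavenumber_to_hz(nu_cm):
    return nu_cm * C_CM_PER_S        # cycles per cm * cm traveled per second

def wavenumber_to_microns(nu_cm):
    return 1e4 / nu_cm               # 1 cm = 10^4 microns

print(wavenumber_to_hz(1700))        # ~5.1e13 Hz
print(wavenumber_to_microns(1700))   # ~5.9 microns
```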
Infrared spectra may be obtained from samples in all phases (liquid, solid and gaseous).
Liquids are usually examined as a thin film sandwiched between two polished salt plates
(note that glass absorbs infrared radiation, whereas NaCl is transparent). If solvents are used
to dissolve solids, care must be taken to avoid obscuring important spectral regions by
solvent absorption. Chlorinated solvents such as carbon tetrachloride, chloroform and
tetrachloroethene are commonly used. Alternatively, solids may either be incorporated in a
thin KBr disk, prepared under high pressure, or mixed with a little non-volatile liquid and
ground to a paste (or mull) that is smeared between salt plates.
2. Vibrational Spectroscopy
A molecule composed of n-atoms has 3n degrees of freedom, six of which are translations
and rotations of the molecule itself. This leaves 3n-6 degrees of vibrational freedom (3n-5 if
the molecule is linear). Vibrational modes are often given descriptive names, such as
stretching, bending, scissoring, rocking and twisting. The four-atom molecule of
formaldehyde, the gas phase spectrum of which is shown below, provides an example of
these terms. We expect six fundamental vibrations (3 × 4 − 6 = 6), and these have been
assigned to the spectrum absorptions.
Gas Phase Infrared Spectrum of Formaldehyde, H2C=O
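The 3n − 6 counting rule (3n − 5 for linear molecules) can be expressed directly:

```python
# Counts the expected fundamental vibrational modes of a molecule:
# 3n - 6 for nonlinear molecules, 3n - 5 for linear ones.
def fundamental_modes(n_atoms, linear=False):
    return 3 * n_atoms - (5 if linear else 6)

print(fundamental_modes(4))               # formaldehyde, H2C=O -> 6
print(fundamental_modes(2, linear=True))  # any diatomic -> 1
print(fundamental_modes(3, linear=True))  # CO2 -> 4
```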
The exact frequency at which a given vibration occurs is determined by the strengths of the
bonds involved and the mass of the component atoms. In practice, infrared spectra do not
normally display separate absorption
signals for each of the 3n-6 fundamental vibrational modes of a molecule. The number of
observed absorptions may be increased by additive and subtractive interactions leading to
combination tones and overtones of the fundamental vibrations, in much the same way that
sound vibrations from a musical instrument interact. Furthermore, the number of observed
absorptions may be decreased by molecular symmetry, spectrometer limitations, and
spectroscopic selection rules. One selection rule that influences the intensity of infrared
absorptions, is that a change in dipole moment should occur for a vibration to absorb infrared
energy. Absorption bands associated with C=O bond stretching are usually very strong
because a large change in the dipole takes place in that mode.
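A common way to model the dependence of stretching frequency on bond strength and atomic mass is the harmonic oscillator approximation, ν̃ = (1/2πc)·√(k/μ). The C=O force constant used below (~1200 N/m) is a typical literature value, assumed here purely for illustration:

```python
import math

# Harmonic oscillator estimate of a stretching wavenumber:
# nu(cm^-1) = (1 / (2*pi*c)) * sqrt(k / mu), with mu the reduced mass.
# The C=O force constant (~1200 N/m) is an assumed, typical value.
C_CM_PER_S = 2.99792e10   # speed of light, cm/s
AMU_KG = 1.66054e-27      # atomic mass unit, kg

def stretch_wavenumber(k_n_per_m, mass1_amu, mass2_amu):
    mu = (mass1_amu * mass2_amu) / (mass1_amu + mass2_amu) * AMU_KG
    return math.sqrt(k_n_per_m / mu) / (2 * math.pi * C_CM_PER_S)

print(round(stretch_wavenumber(1200.0, 12.0, 16.0)))  # ~1723 cm^-1, in the C=O range
```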
Some General Trends:
i) Stretching frequencies are higher than corresponding bending frequencies. (It is
easier to bend a bond than to stretch or compress it.)
ii) Bonds to hydrogen have higher stretching frequencies than those to heavier atoms.
iii) Triple bonds have higher stretching frequencies than corresponding double
bonds, which in turn have higher frequencies than single bonds.
(Except for bonds to hydrogen).
The general regions of the infrared spectrum in which various kinds of vibrational bands are
observed are outlined in the following chart. Note that the blue colored sections above the
dashed line refer to stretching vibrations, and the green colored band below the line
encompasses bending vibrations. The complexity of infrared spectra in the 1450 to 600 cm⁻¹
region makes it difficult to assign all the absorption bands, and because of the unique
patterns found there, it is often called the fingerprint region. Absorption bands in the 4000 to
1450 cm⁻¹ region are usually due to stretching vibrations of diatomic units, and this is
sometimes called the group frequency region.
3. Group Frequencies
Detailed information about the infrared absorptions observed for various bonded atoms and
groups is usually presented in tabular form. The following table provides a collection of such
data for the most common functional groups. Following the color scheme of the chart,
stretching absorptions are listed in the blue-shaded section and bending absorptions in the
green shaded part. More detailed descriptions for certain groups (e.g. alkenes, arenes,
alcohols, amines & carbonyl compounds) may be viewed by clicking on the functional
class name. Since most organic compounds have C-H bonds, a useful rule is that absorption
in the 2850 to 3000 cm⁻¹ region is due to sp³ C-H stretching; whereas, absorption above
3000 cm⁻¹ is from sp² C-H stretching, or sp C-H stretching if it is near 3300 cm⁻¹.
Typical Infrared Absorption Frequencies (cm⁻¹)

Alkanes — stretching: 2850-3000 str (CH3, CH2 & CH, 2 or 3 bands); bending: 1350-1470 med
(CH2 & CH3 deformation), 720-725 wk (CH2 rocking)
Arenes — stretching: 3030 var (C-H), 1600 & 1500 med-wk (C=C in ring, 2 bands; 3 if
conjugated); bending: 690-900 str & med (C-H, may be several bands)
Alcohols & Phenols — stretching: 3580-3650 var (O-H free), usually sharp; 3200-3550 str,
broad (O-H, H-bonded); 970-1250 str (C-O)
Amines — stretching: 3400-3500 wk (dil. soln., NH2 & N-H; shifts on H-bonding), 1000-1250
med (C-N); bending: 1550-1650 med-str (NH2 scissoring)
Aldehydes & Ketones — stretching: 2690-2840 med (aldehyde C-H, 2 bands), 1720-1740 str
(saturated aldehyde C=O), 1710-1720 str (saturated ketone), 1680-1700 str (aryl ketone)
Acids & Derivatives — stretching: 2500-3300 (acid O-H, very broad), 1705-1720 str (acid
C=O), 1210-1320 str (O-C, sometimes 2 peaks), 1630-1695 str (amide I, C=O)
To illustrate the usefulness of infrared absorption spectra, examples for five C4H8O isomers
are presented below their corresponding structural formulas. Try to associate each spectrum
(A - E) with one of the isomers shown above it.
4. Other Functional Groups
Infrared absorption data for some functional groups not listed in the preceding table are given
below. Most of the absorptions cited are associated with stretching vibrations. Standard
abbreviations (str = strong, wk = weak, brd = broad & shp = sharp) are used to describe the
intensities and shapes of the bands.
Functional Class — Characteristic Absorptions
S-H thiols: 2550-2600 cm⁻¹ (wk & shp)
S-OR esters: 700-900 (str)
S-S disulfide: 500-540 (wk)
C=S thiocarbonyl: 1050-1200 (str)
S=O sulfones: 1325 ± 25 (as) & 1140 ± 20 (s) (both str)
N=O nitroso: 1550 ± 50 (str)
N=O nitro: 1530 ± 20 (as) & 1350 ± 30 (s)
Nuclear Magnetic Resonance Spectroscopy
Over the past fifty years nuclear magnetic resonance spectroscopy, commonly referred to as
nmr, has become the preeminent technique for determining the structure of organic
compounds. Of all the spectroscopic methods, it is the only one for which a complete analysis
and interpretation of the entire spectrum is normally expected. Although larger amounts of
sample are needed than for mass spectroscopy, nmr is non-destructive, and with modern
instruments good data may be obtained from samples weighing less than a milligram. To be
successful in using nmr as an analytical tool, it is necessary to understand the physical
principles on which the methods are based.
The nuclei of many elemental isotopes have a characteristic spin (I). Some nuclei have
integral spins (e.g. I = 1, 2, 3 ....), some have fractional spins (e.g. I = 1/2, 3/2, 5/2 ....), and a
few have no spin, I = 0 (e.g. ¹²C, ¹⁶O, ³²S ....). Isotopes of particular interest and use to
organic chemists are ¹H, ¹³C, ¹⁹F and ³¹P, all of which have I = 1/2. Since the analysis of this
spin state is fairly straightforward, our discussion of nmr will be limited to these and other
I = 1/2 nuclei.
The following features lead to the nmr phenomenon:
1. A spinning charge generates a magnetic field.
The resulting spin-magnet has a magnetic moment (μ) proportional to the spin.
2. In the presence of an external magnetic field (B0), two spin states
exist, +1/2 and −1/2.
The magnetic moment of the lower energy +1/2 state is aligned with the external
field, but that of the higher energy −1/2 spin state is opposed to the external field.
3. The difference in energy between the two spin states is dependent on the external magnetic field strength, and is
always very small. The two spin states have the same energy when the external field is zero, but diverge as the
field increases; at a field equal to Bx the energy difference is ΔE = 2μBx (for I = 1/2, with μ the magnetic moment
of the nucleus in the field).
Strong magnetic fields are necessary for nmr spectroscopy. The international unit for magnetic flux density is the
tesla (T). The earth's magnetic field is not constant, but is approximately 10⁻⁴ T at ground level. Modern nmr
spectrometers use powerful magnets having fields of 1 to 20 T. Even with these high fields, the energy difference
between the two spin states is less than 0.1 cal/mole. To put this in perspective, recall that infrared transitions
involve 1 to 10 kcal/mole and electronic transitions are nearly 100 times greater.
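The "less than 0.1 cal/mole" figure can be checked with ΔE = 2μB for a spin-1/2 nucleus:

```python
# Energy gap between proton spin states, delta_E = 2*mu*B for I = 1/2,
# expressed per mole to compare with the "< 0.1 cal/mole" figure quoted.
MU_N = 5.05078e-27      # nuclear magneton, J/T
AVOGADRO = 6.02214e23
J_PER_CAL = 4.184
MU_PROTON = 2.7927      # proton magnetic moment, in nuclear magnetons

def spin_gap_cal_per_mol(b_tesla, mu_magnetons=MU_PROTON):
    delta_e = 2 * mu_magnetons * MU_N * b_tesla   # J per nucleus
    return delta_e * AVOGADRO / J_PER_CAL

print(spin_gap_cal_per_mol(20.0))   # ~0.08 cal/mol even at 20 T
```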
For nmr purposes, this small energy difference (ΔE) is usually given as a frequency in units of MHz (10⁶ Hz),
ranging from 20 to 900 MHz, depending on the magnetic field strength and the specific nucleus being studied.
Irradiation of a sample with radio frequency (rf) energy corresponding exactly to the spin state separation of a
specific set of nuclei will cause excitation of those nuclei in the +1/2 state to the higher -1/2 spin state. Note that
this electromagnetic radiation falls in the radio and television broadcast spectrum. Nmr spectroscopy is therefore
the energetically mildest probe used to examine the structure of molecules.
The nucleus of a hydrogen atom (the proton) has a magnetic moment μ = 2.7927, and has been studied more than
any other nucleus.
4. For spin 1/2 nuclei the energy difference between the two spin states at a given magnetic field strength will be
proportional to their magnetic moments. For the four common nuclei noted above, the magnetic moments are:
¹H μ = 2.7927, ¹⁹F μ = 2.6273, ³¹P μ = 1.1305 & ¹³C μ = 0.7022. These moments are expressed in nuclear
magnetons (5.05078 × 10⁻²⁷ J/T). In an external magnetic field of 2.35 T the corresponding spin state energy
separations give resonance frequencies of about 100 MHz (¹H), 94 MHz (¹⁹F), 40.5 MHz (³¹P) and 25 MHz (¹³C);
frequency (energy difference) correlates directly with magnetic moment, ν = 2μB₀/h
(h = Planck's constant = 6.626 × 10⁻³⁴ J·s).
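The frequency formula can be used to reproduce the resonance frequencies of all four nuclei at 2.35 T:

```python
# Resonance frequency of a spin-1/2 nucleus: nu = 2 * mu * B0 / h,
# with mu given in nuclear magnetons (moments quoted in the text).
H = 6.62607e-34       # Planck constant, J*s
MU_N = 5.05078e-27    # nuclear magneton, J/T

MOMENTS = {"1H": 2.7927, "19F": 2.6273, "31P": 1.1305, "13C": 0.7022}

def resonance_mhz(mu_magnetons, b_tesla=2.35):
    return 2 * mu_magnetons * MU_N * b_tesla / H / 1e6

for nucleus, mu in MOMENTS.items():
    print(nucleus, round(resonance_mhz(mu), 1))
# 1H ~100.1, 19F ~94.1, 31P ~40.5, 13C ~25.2 MHz at 2.35 T
```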
2. Proton NMR Spectroscopy
This important and well-established application of nuclear magnetic resonance will serve to
illustrate some of the novel aspects of this method. To begin with, the nmr spectrometer must
be tuned to a specific nucleus, in this case the proton. The actual procedure for obtaining the
spectrum varies, but the simplest is referred to as the continuous wave (CW) method. A
typical CW-spectrometer is shown in the following diagram. A solution of the sample in a
uniform 5 mm glass tube is oriented between the poles of a powerful magnet, and is spun to
average any magnetic field variations, as well as tube imperfections. Radio frequency
radiation of appropriate energy is broadcast into the sample from an antenna coil. A
receiver coil surrounds the sample tube, and emission of absorbed rf energy is
monitored by dedicated electronic devices and a computer. An nmr spectrum is acquired by
varying or sweeping the magnetic field over a small range while observing the rf signal from
the sample. An equally effective technique is to vary the frequency of the rf radiation while
holding the external field constant.
As an example, consider a sample of water in a 2.3487 T external magnetic field, irradiated
by 100 MHz radiation. If the magnetic field is smoothly increased to 2.3488 T, the hydrogen
nuclei of the water molecules will at some point absorb rf energy and a resonance signal will
appear.
Since protons all have the same magnetic moment, we might expect all hydrogen atoms to
give resonance signals at the same field / frequency values. Fortunately for chemistry
applications, this is not true: representative proton signals from different compounds spread
out over the magnetic field range. It is not possible, of course, to examine isolated protons in
the spectrometer described above; but from independent measurement and calculation it has
been determined that a naked proton would resonate at a lower field strength than the nuclei
of covalently bonded hydrogens. With the exception of water, chloroform and sulfuric acid,
which are examined as liquids, all the other compounds are measured as gases.
Why should the proton nuclei in different compounds behave
differently in the nmr experiment?
The answer to this question lies with the electron(s) surrounding
the proton in covalent compounds and ions. Since electrons are
charged particles, they move in response to the external magnetic
field (Bo) so as to generate a secondary field that opposes the much
stronger applied field. This secondary field shields the nucleus
from the applied field, so Bo must be increased in order to achieve
resonance (absorption of rf energy). Compounds that give resonance
signals at the higher field side of the range (CH4, HCl, HBr and HI)
have proton nuclei that are more shielded than those at the lower
field side.
The magnetic field range displayed in the above diagram is very small compared with the
actual field strength (only about 0.0042%). It is customary to refer to small increments such
as this in units of parts per million (ppm). The difference between 2.3487 T and 2.3488 T is
therefore about 42 ppm. Instead of designating a range of nmr signals in terms of magnetic
field differences (as above), it is more common to use a frequency scale, even though the
spectrometer may operate by sweeping the magnetic field. Using this terminology, we would
find that at 2.34 T the proton signals shown above extend over a 4,200 Hz range (for a 100
MHz rf frequency, 42 ppm is 4,200 Hz). Most organic compounds exhibit proton resonances
that fall within a 12 ppm range (the shaded area), and it is therefore necessary to use very
sensitive and precise spectrometers to resolve structurally distinct sets of hydrogen atoms
within this narrow range. In this respect it might be noted that the detection of a part-per-
million difference is equivalent to detecting a 1 millimeter difference in distances of 1
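The ppm arithmetic above can be verified directly:

```python
# Parts-per-million arithmetic from the text: the difference between
# 2.3487 T and 2.3488 T, and the 1 mm in 1 km analogy.
def ppm(difference, reference):
    return difference / reference * 1e6

field_ppm = ppm(2.3488 - 2.3487, 2.3487)
print(round(field_ppm, 1))            # ~42.6 ppm, "about 42 ppm" in the text
print(round(ppm(1e-3, 1e3), 6))       # 1 mm in 1 km = 1.0 ppm
```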
Unlike infrared and uv-visible spectroscopy, where absorption peaks are uniquely located by
a frequency or wavelength, the location of different nmr resonance signals is dependent on
both the external magnetic field strength and the rf frequency. Since no two magnets will
have exactly the same field, resonance frequencies will vary accordingly and an alternative
method for characterizing and specifying the location of nmr signals is needed. This problem
is illustrated by the eleven different compounds shown in the following diagram. Although
the eleven resonance signals are distinct and well separated, an unambiguous numerical
locator cannot be directly assigned to each.
One method of solving this problem is to report the location of an nmr signal in a spectrum
relative to a reference signal from a standard compound added to the sample. Such a
reference standard should be chemically unreactive, and easily removed from the sample
after the measurement. Also, it should give a single sharp nmr signal that does not interfere
with the resonances normally observed for organic compounds. Tetramethylsilane,
(CH3)4Si, usually referred to as TMS, meets all these characteristics, and has become the
reference compound of choice for proton and carbon nmr.
Since the separation (or dispersion) of nmr signals is magnetic field dependent, one additional
step must be taken in order to provide an unambiguous location unit. To correct the
frequency differences for their field dependence, we divide them by the spectrometer
frequency (100 or 500 MHz in the example). The resulting number would be very small,
since we are dividing Hz by MHz, so it is multiplied by a million. This operation gives a
locator number called the chemical shift, having units of parts-per-million (ppm) and
designated by the symbol δ:

δ = 10⁶ · (νsamp − νref) / νspectrometer

Here νref is the resonant frequency of the reference signal and νsamp is the
frequency of the sample signal.
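The chemical shift operation can be sketched as follows; the 728 Hz offset is a hypothetical example value (roughly where CHCl3 resonates relative to TMS on a 100 MHz instrument):

```python
# Chemical shift: delta = (nu_sample - nu_reference) / nu_spectrometer * 1e6.
# The 728 Hz offset below is a hypothetical example value.
def chemical_shift_ppm(nu_sample_hz, nu_ref_hz, spectrometer_hz):
    return (nu_sample_hz - nu_ref_hz) / spectrometer_hz * 1e6

# A signal 728 Hz downfield of TMS on a 100 MHz spectrometer:
print(chemical_shift_ppm(728.0, 0.0, 100e6))   # 7.28 ppm
# The same shift on a 500 MHz spectrometer spans 5x as many Hz:
print(chemical_shift_ppm(3640.0, 0.0, 500e6))  # 7.28 ppm, field-independent
```

The whole point of the δ scale is visible in the second call: the Hz separation grows with the magnet, but δ does not.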
The compounds referred to above share two common characteristics:
• The hydrogen atoms in a given molecule are all structurally equivalent, averaged for
fast conformational equilibria.
• The compounds are all liquids, save for neopentane which boils at 9 °C and is a
liquid in an ice bath.
The first feature assures that each compound gives a single sharp resonance signal. The
second allows the pure (neat) substance to be poured into a sample tube and examined in a
nmr spectrometer. In order to take the nmr spectra of a solid, it is usually necessary to
dissolve it in a suitable solvent. Early studies used carbon tetrachloride for this purpose, since
it has no hydrogen that could introduce an interfering signal. Unfortunately, CCl4 is a poor
solvent for many polar compounds and is also toxic. Deuterium labeled compounds, such as
deuterium oxide (D2O), chloroform-d (DCCl3), benzene-d6 (C6D6), acetone-d6 (CD3COCD3)
and DMSO-d6 (CD3SOCD3) are now widely used as nmr solvents. Since the deuterium
isotope of hydrogen has a different magnetic moment and spin, it is invisible in a
spectrometer tuned to protons.
From the previous discussion and examples we may deduce that one factor contributing to
chemical shift differences in proton resonance is the inductive effect. If the electron density
about a proton nucleus is relatively high, the induced field due to electron motions will be
stronger than if the electron density is relatively low. The shielding effect in such high
electron density cases will therefore be larger, and a higher external field (Bo) will be needed
for the rf energy to excite the nuclear spin. Since silicon is less electronegative than carbon,
the electron density about the methyl hydrogens in tetramethylsilane is expected to be greater
than the electron density about the methyl hydrogens in neopentane (2,2-dimethylpropane),
and the characteristic resonance signal from the silane derivative does indeed lie at a higher
magnetic field. Such nuclei are said to be shielded. Elements that are more electronegative
than carbon should exert an opposite effect (reduce the electron density); and, as the data in
the following tables show, methyl groups bonded to such elements display lower field signals
(they are deshielded). The deshielding effect of electron withdrawing groups is roughly
proportional to their electronegativity, as shown by the left table. Furthermore, if more than
one such group is present, the deshielding is additive (table on the right), and proton
resonance is shifted even further downfield.
Proton Chemical Shifts of Methyl Derivatives (δ, ppm)

Compound   (CH3)4C    (CH3)3N    (CH3)2O    CH3F
δ          0.9        2.1        3.2        4.1
Compound   (CH3)4Si   (CH3)3P    (CH3)2S    CH3Cl
δ          0.0        0.9        2.1        3.0

Proton Chemical Shifts (ppm)

          X = Cl    X = Br    X = I     X = OR    X = SR
CH3X      3.0       2.7       2.1       3.1       2.1
CH2X2     5.3       5.0       3.9       4.4       3.7
CHX3      7.3       6.8       4.9       5.0       —
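The additivity of deshielding can be checked against the chlorine-substituted methanes in the tables above:

```python
# Checks the additivity claim using the chlorinated methane shifts from
# the table: each added Cl moves the proton downfield by roughly 2 ppm.
shifts = {"CH3Cl": 3.0, "CH2Cl2": 5.3, "CHCl3": 7.3}

increments = [
    shifts["CH2Cl2"] - shifts["CH3Cl"],   # second Cl: ~2.3 ppm
    shifts["CHCl3"] - shifts["CH2Cl2"],   # third Cl:  ~2.0 ppm
]
print(increments)   # roughly constant steps -> additive deshielding
```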
The general distribution of proton chemical shifts associated with different functional groups
is summarized in the following chart. Bear in mind that these ranges are approximate, and
may not encompass all compounds of a given class. Note also that the ranges specified for
OH and NH protons (colored orange) are wider than those for most CH protons. This is due
to hydrogen bonding variations at different sample concentrations.
Proton Chemical Shift Ranges*
* For samples in CDCl3 solution. The δ scale is relative to TMS at δ = 0.
The magnitude or intensity of nmr resonance signals is displayed along the vertical axis of a
spectrum, and is proportional to the molar concentration of the sample. Thus, a small or dilute
sample will give a weak signal, and doubling or tripling the sample concentration increases
the signal strength proportionally. If we take the nmr spectrum of equal molar amounts of
benzene and cyclohexane in carbon tetrachloride solution, the resonance signal from
cyclohexane will be twice as intense as that from benzene because cyclohexane has twice as
many hydrogens per molecule. This is an important relationship when samples incorporating
two or more different sets of hydrogen atoms are examined, since it allows the ratio of
hydrogen atoms in each distinct set to be determined. To this end it is necessary to measure
the relative strength as well as the chemical shift of the resonance signals that comprise an
nmr spectrum. Two common methods of displaying the integrated intensities associated with
a spectrum are illustrated by the following examples. In the three spectra in the top row, a
horizontal integrator trace (light green) rises as it crosses each signal by a distance
proportional to the signal strength. Alternatively, an arbitrary number, selected by the
instrument's computer to reflect the signal strength, is printed below each resonance peak, as
shown in the three spectra in the lower row. From the relative intensities shown here, together
with the previously noted chemical shift correlations, the reader should be able to assign the
signals in these spectra to the set of hydrogens that generates each.
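The intensity-ratio reasoning above reduces to counting hydrogens in each distinct set and simplifying:

```python
from functools import reduce
from math import gcd

# Signal intensity is proportional to the number of hydrogens in each
# distinct set; integrals therefore reduce to a whole-number ratio.
def intensity_ratio(h_counts):
    g = reduce(gcd, h_counts)
    return [h // g for h in h_counts]

# Equal molar benzene (6 H) and cyclohexane (12 H), as in the text:
print(intensity_ratio([6, 12]))   # [1, 2] -> cyclohexane twice as intense
# An ethyl group's CH2 and CH3 sets:
print(intensity_ratio([2, 3]))    # [2, 3]
```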