Protein_Basics.pptx

C Manikandan
Protein Basics
- Forces and Software Tools

Hydrophobic and Hydrophilic amino acids
Based on Hydropathy Index
• Hydrophobic (water-fearing): Glycine(G),
Alanine(A), Isoleucine(I), Leucine(L),
Valine(V), Proline(P, an imino acid),
Phenylalanine(F). These amino acids are usually
found in the interior of proteins. The group is
aliphatic, plus one aromatic (F) and one imino acid (P).
• Hydrophilic (water-loving, exterior):
{Serine(S), Threonine(T): hydroxyl (OH)}, {Aspartic
Acid(D), Glutamic Acid(E): acidic},
Cysteine(C): sulfur-containing,
{Asparagine(N), Glutamine(Q): amide},
{Arginine(R), Histidine(H): basic groups}
• Amphipathic (both): Lysine(K),
{Tryptophan(W), Tyrosine(Y): aromatic},
Methionine(M): sulfur-containing
Image courtesy: Researchgate

Amphipathic Amino acids
What is beneficial about them?
• Amphipathic amino acids could readily tolerate the change of environment
from hydrophilic to hydrophobic that occurs upon antibody-antigen complex
formation. Residues that are large and can participate in a wide variety of van
der Waals' and electrostatic interactions would permit binding to a range of
antigens. Amino acids with flexible side-chains could generate a structurally
plastic region, i.e. a binding site possessing the ability to mould itself around
the antigen to improve complementarity of the interacting surfaces. Hence,
antibodies could bind to an array of novel antigens using a limited set of
residues interspersed with more unique residues to which greater binding
specificity can be attributed.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2829252/

Amino Acids are based on…?
Amine group(NH2), Carboxyl group(COOH) attached to the C-alpha
• Simplest Amino Acid - Glycine shown below(2D &3D). Blue Atom- Nitrogen,
Red Atom- Oxygen, Grey - Carbon. Usually found in NH3+, COO-.
• Ampholytes are molecules that contain both acidic and basic groups (and are
therefore amphoteric) and will exist as zwitterions(having both positive and
negative charges) at a certain pH.

Ampholytes and Amphoteric
What is the difference?
• The term amphoteric means the ability of a molecule, ion or any other complex compound to act
both as a base and an acid. If the proton donation and acceptance definition of acid and base is
used, amphiprotic is another term. The prefixes ambi- (as in ambidextrous, the ability to use both
hands equally well) and amphi- (as in amphibian) both mean "both ways".
• Ampholytes are substances that contain both basic and acidic groups. The term doesn't necessarily
indicate amphiproticity. Ampholyte is the common terminology for amphoteric substances.
• Among inorganic substances, zinc oxide (ZnO), aluminum oxide (Al2O3), aluminum hydroxide
(Al(OH)3), and lead oxides are amphoteric. In acidic mediums, they act as bases, and in basic
mediums, they act as acids. The most common and well-known amphoteric molecules are amino
acids, which are organic (carbon-containing) and which we can observe in all biological systems.
• Sources: differencebetween.com https://en.wikipedia.org/

Polar (partially charged) covalent bond
Except in molecules like O2 and N2, pure covalent bonds do not
exist
• The difference in electronegativity between the bonded atoms in a molecule
is not equal to zero, which means one atom attracts the shared electrons
towards itself and the bond carries partial charges. Hydrophilic molecules
have such charges and can be dissolved in water.
• Electronegativity of oxygen is 3.44 and hydrogen is 2.20, so a polar covalent
bond is formed between the atoms

Periodic table of electronegativity values

Electron affinity and electronegativity
Potential energy of an atom is potential energy stored in nucleus
• The electron affinity of an element is a measurable physical quantity, namely,
the energy released or absorbed when an isolated gas-phase atom acquires an
electron, measured in kJ/mol.
• Electronegativity, on the other hand, describes how tightly an atom attracts
electrons in a bond. It is a dimensionless quantity that is calculated, not
measured. Pauling derived the first electronegativity values by comparing the
amounts of energy required to break different types of bonds. He chose an
arbitrary relative scale ranging from 0 to 4.
• Source: Univ of Wisconsin Madison Chemistry 103 book

Polar, Pure Covalent(Nonpolar) and Ionic
bonds
Based on electronegativity difference
• Image Sources: UW Madison and Research Gate article
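
The electronegativity-difference rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1.9 cutoff for ionic bonds is the one quoted later in this deck, while the 0.4 cutoff for nonpolar bonds and the function name classify_bond are assumptions, not values taken from the slides.

```python
def classify_bond(en_a, en_b, nonpolar_max=0.4, ionic_min=1.9):
    """Classify a bond from the Pauling electronegativities of its two atoms.

    nonpolar_max is an assumed textbook-style cutoff; ionic_min follows the
    1.9 value used elsewhere in this deck.
    """
    diff = abs(en_a - en_b)
    if diff > ionic_min:
        return "ionic"
    if diff <= nonpolar_max:
        return "nonpolar covalent"
    return "polar covalent"


print(classify_bond(3.44, 2.20))  # O-H bond in water
print(classify_bond(3.04, 2.20))  # N-H bond in an amino group
print(classify_bond(3.44, 3.44))  # O=O bond in O2
```

The cutoffs are conventions rather than sharp physical boundaries, so borderline differences should be read with care.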

Water molecule (CPK colour, Ball Stick)
There is an electronegativity difference, so it is a polar molecule
• Electronegativity of oxygen is 3.44 and
hydrogen is 2.20, so a polar covalent bond is formed
between the atoms. Oxygen carries the partial
negative charge (δ-) and its lone pairs make it the
nucleophilic site; hydrogen carries the partial
positive charge (δ+).
• The bond angle is predicted by
VSEPR (Valence Shell Electron Pair Repulsion)
theory. It can be calculated from the vector dot
product of the two bond vectors, which equals the
product of their magnitudes multiplied by the
cosine of the angle between them.
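
The dot-product relation u·v = |u||v|cos(θ) can be turned into a small angle calculation. The sketch below is illustrative: the function name angle_between and the coordinates are hypothetical, and the two O-H vectors are constructed by hand so that they enclose the commonly quoted water angle of about 104.5°.

```python
import math


def angle_between(u, v):
    """Angle in degrees between two 3D vectors, from u.v = |u||v|cos(theta)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point round-off
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical geometry: oxygen at the origin, two unit O-H vectors
half = math.radians(104.5 / 2)
oh1 = (math.sin(half), math.cos(half), 0.0)
oh2 = (-math.sin(half), math.cos(half), 0.0)
print(round(angle_between(oh1, oh2), 1))
```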

Corey–Pauling–Koltun(CPK) coloring for
atoms
Conventional and used across different software
Courtesy: Wikipedia

Structures predicted by Valence Shell Electron Pair Repulsion theory
Note: Pyramidal, Planar are at the end & Tri-, Square are at the first
• A central atom is defined in this theory as
an atom which is bonded to two or more
other atoms, while a terminal atom is
bonded to only one other atom.
• VSEPR theory works on the logic that the
bonded pairs of electrons tend to repel
each other, and electrons are counted in
pairs. So, for an anion (the ion that moves
towards the anode, i.e. a negative ion), the
electrons corresponding to the charge need
to be added to the valence shell count. In the
picture, E represents a lone pair. For example,
in [PO4]3-, P should be at the centre, and 3
electrons are to be added for VSEPR
Image Courtesy: Byjus

More VSEPR Shapes
CN- Total no. of electron pairs
• VSEPR is often explained to beginners as eight simpler postulates:
• Molecular shape can be determined by the number of electron pairs
present.
• Electron pairs tend to repel one another.
• Electron pairs arrange themselves to minimize the repulsion between
them.
• The valence or outermost electron shell is assumed to be spherical.
• A multiple bond is counted as a single electron pair, just like a
single bonded electron pair.
• Lone pair electrons have the maximum repulsion, and bond pair
electrons the minimum.
• All electron pairs assume positions of least repulsion and they could
be in same position(equatorial) or above/below(axial).
• Repulsive interaction of electron pairs is greatest between lone pairs
and least between bond pairs: bond pair – bond pair < lone pair –
bond pair < lone pair – lone pair.
Source: Sigmaaldrich.com
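
The link between electron-pair counts and shape can be written as a lookup table. The sketch below is a minimal illustration: the dictionary name VSEPR_SHAPES and the function vsepr_shape are hypothetical, and only the common cases up to six electron pairs are tabulated.

```python
# (atoms bonded to the central atom, lone pairs on the central atom) -> shape
VSEPR_SHAPES = {
    (2, 0): "linear",
    (3, 0): "trigonal planar",
    (2, 1): "bent",
    (4, 0): "tetrahedral",
    (3, 1): "trigonal pyramidal",
    (2, 2): "bent",
    (5, 0): "trigonal bipyramidal",
    (4, 1): "seesaw",
    (3, 2): "T-shaped",
    (2, 3): "linear",
    (6, 0): "octahedral",
    (5, 1): "square pyramidal",
    (4, 2): "square planar",
}


def vsepr_shape(bonded_atoms, lone_pairs):
    """Look up the molecular shape; multiple bonds count as one bonded atom."""
    return VSEPR_SHAPES.get((bonded_atoms, lone_pairs), "not tabulated")


print(vsepr_shape(2, 2))  # water: two O-H bonds, two lone pairs
print(vsepr_shape(3, 1))  # ammonia: three N-H bonds, one lone pair
print(vsepr_shape(4, 0))  # phosphate: four P-O bonds, no lone pair on P
```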

Logic behind the structure of VSEPR theory
(In pic) The structure of water molecules stabilised by many Hydrogen bonds
• VSEPR theory is based on the logic that
the repulsion between two lone pairs is
greater than the repulsion between a lone
pair and a bonded pair, which in turn is
greater than the repulsion between two
bonded pairs. Bond pair electrons are held
by the electrostatic attraction of two
nuclei, so their electron density is
localised between the bonded atoms. A lone
pair is attracted by only one nucleus, so
its electron density is more spread out
around the central atom. Because of this,
repulsion between two bond pairs is the
weakest of the three and repulsion between
two lone pairs is the strongest.

How is the bond angle influenced?
Helps in understanding the structure of the molecule and possibly in protein folding pathways/electron transfer reactions
Courtesy: https://chem.libretexts.org/

Electrostatic Attraction/Repulsion Force
Orbital shapes in hybridization during covalent bonding depend on
VSEPR
• When two point charges (idealised objects with no extent in space), whether
of the same or opposite sign, are separated from each other by a distance, an
electrostatic repulsive or attractive force acts between them. This force is
given by Coulomb's inverse square law, which states that F ∝ (q1*q2/r^2)
(∝ is the proportionality sign), where q1 and q2 are the magnitudes of the
charges and r is the distance between their centres, here referring to nuclei.
The 1/4π factor in the constant of proportionality comes from the fact that the electric field from a point charge is spherically symmetric. The field
spreads outward evenly in all directions over a sphere of surface area 4πr^2, so it weakens as 1/r^2.
Source: https://en.wikipedia.org
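
Coulomb's law with its 1/(4π ε0) prefactor can be evaluated directly. This is a minimal sketch: the function name coulomb_force, the 3 Å separation and the relative permittivity of about 80 for water are assumptions chosen for illustration.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m
E_CHARGE = 1.602176634e-19    # elementary charge in C


def coulomb_force(q1, q2, r, relative_permittivity=1.0):
    """Signed Coulomb force in newtons between two point charges.

    A positive result means repulsion, a negative result means attraction.
    """
    return q1 * q2 / (4 * math.pi * EPSILON_0 * relative_permittivity * r ** 2)


r = 3e-10  # assumed separation of 3 angstrom, in metres
f_vacuum = coulomb_force(E_CHARGE, -E_CHARGE, r)
f_water = coulomb_force(E_CHARGE, -E_CHARGE, r, relative_permittivity=80.0)
print(f_vacuum, f_water)
```

Dividing by the relative permittivity shows why opposite charges attract each other far less strongly in water than in vacuum.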

Extending electronegativity to Amino group in Amino
Acid
Bonding between Nitrogen and Hydrogen
• In the amino group of an amino acid, nitrogen's atomic
number (Z) is 7. It has 5 valence electrons in its valence
shell. By the octet rule, in order to reach the stability of
an inert gas, that is 8 electrons in its valence shell,
nitrogen shares two electron pairs with two hydrogens
and one pair with carbon, and keeps one lone pair of
electrons. The electronegativities of nitrogen and
hydrogen are 3.04 and 2.20, a difference of 0.84, so
there is a partial charge difference. Nitrogen (δ-, the
nucleophilic site) attracts the shared electrons more
tightly to itself than hydrogen (δ+), creating a
dipole moment, which is a vector quantity whose
value is the magnitude of the charge multiplied by
the distance between the charges.
Image courtesy: Western Oregon University

Dipole moment is a vector quantity
Direction will be towards the more electronegative atom
• The electric dipole moment is a measure of the separation of positive and
negative electrical charges within a system, that is, a measure of the system's
overall polarity. The SI unit for electric dipole moment is the coulomb-meter
(C⋅m). The debye (D) is another unit of measurement used in atomic physics
and chemistry.
Source: Chem Libretexts
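
The magnitude of a dipole moment is charge times separation, and converting it to debye is a one-line calculation. The sketch below is illustrative: the function name dipole_moment_debye is hypothetical, and the conversion uses 1 D ≈ 3.33564 × 10^-30 C·m.

```python
E_CHARGE = 1.602176634e-19  # elementary charge in C
DEBYE = 3.33564e-30         # one debye expressed in C·m


def dipole_moment_debye(partial_charge_e, distance_angstrom):
    """Dipole moment magnitude in debye for charges +q and -q.

    partial_charge_e is q in units of the elementary charge;
    distance_angstrom is the separation of the two charges.
    """
    charge = partial_charge_e * E_CHARGE      # coulombs
    distance = distance_angstrom * 1e-10      # metres
    return charge * distance / DEBYE


# A full elementary charge separated by 1 angstrom is roughly 4.8 D
print(round(dipole_moment_debye(1.0, 1.0), 2))
```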

Dielectric constant
Also known as relative permittivity w.r.t. vacuum
• The relative permittivity (in older texts, dielectric
constant) is the permittivity of a material
expressed as a ratio with the electric permittivity
of a vacuum. A dielectric is an insulating
material, and the dielectric constant of an
insulator measures the ability of the insulator to
store electric energy in an electrical field.
• Permittivity is a material's property that affects
the Coulomb force between two point charges in
the material. Relative permittivity is the factor by
which the electric field between the charges is
decreased relative to vacuum.
Courtesy:Wikipedia

Some ways a protein is represented
Ball and Stick, Wireframe, Ribbon, Molecular Surface, Mesh
• CPK coloring is used (screenshot
to be attached)
• Molecular surface representation is
used for binding analysis. It can be
drawn as a mesh (where the surface is
triangulated and the method is faster)
or as a solid surface, with the
solvent-accessible surface on the exterior

Peptide plane and bond
Non-polar solvents are solvents that are composed of atoms that have small differences in electronegativity and
they contain bonds between atoms with similar electronegativities, such as carbon and hydrogen.
• A peptide bond is formed between the carboxyl
group (-COOH) of one amino acid and the amino
group (-NH2) of the next
• The peptide plane is the plane containing the
peptide bond
• Hydrophobicity is related to polarity (charge
separation in the molecule) because hydrophobic
molecules are non-polar. Hydrophobic molecules do
not have full or significant partial charges (the
difference in electronegativity between the bonded
atoms is close to zero) and thus do not dissolve in
water. Hydrophilic molecules do have charges and
can be dissolved in water.

Dihedral angle
Dot product of normal vectors. How?
• A dihedral angle is the angle between two intersecting planes or half-planes.
In chemistry, it is the clockwise angle between half-planes through two sets of
three atoms, having two atoms in common. In solid geometry, it is defined as
the union of a line and two half-planes that have this line as a common edge.
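
One way to answer the "How?" above: take the normal vector of each plane with a cross product, then combine the dot product of the normals with a sign term. The sketch below is a plain-Python illustration; the function name dihedral and the test coordinates are hypothetical, and the atan2 form is used so the result is a signed angle in the range -180° to 180°.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    x = _dot(n1, n2)                                # proportional to cos(angle)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)      # proportional to sin(angle)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: cis, perpendicular and trans arrangements
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1)))
print(dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0)))
```

The same function applies to the backbone angles of a protein when the four points are the relevant consecutive backbone atoms.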

Dihedral angles in protein
Phi, psi and omega
• ϕ - to describe rotation about the N-C(α) bond and involves the C(O)-N-C(α)-
C(O) bonds
• ψ- to describe rotation about the C(α)-C(O) bond and involves the N-C(α)-
C(O)-N bonds
• ω - to describe rotation about the C(O)-N bond and involves the C(α)-C(O)-N-
C(α) bonds

Ramachandran Plot
Courtesy:Wikipedia
• In biochemistry, a Ramachandran plot (also known
as a Rama plot, a Ramachandran diagram or a [φ,ψ]
plot), originally developed in 1963 by G. N.
Ramachandran, C. Ramakrishnan, and V.
Sasisekharan,[1] is a way to visualize energetically
allowed regions for backbone dihedral angles ψ
against φ of amino acid residues in protein structure.
The figure on the left illustrates the definition of the φ
and ψ backbone dihedral angles[2] (called φ and φ'
by Ramachandran). The ω angle at the peptide bond
is normally 180°, since the partial-double-bond
character keeps the peptide planar.

The four levels of structure in protein
• Primary - sequence
• Secondary - helix, sheet, loop, turn
• Tertiary - folded structure
based on interactions
between side chains of
amino acids
• Quaternary - assembly of
protein subunits. This is not the
same as supersecondary structure,
which is a recurring combination of
secondary structure elements
within a single chain

Domains vs Motifs
Next slide will be on Supersecondary structure and Protein fold
• A motif is a similar 3-D structure conserved among different proteins that serves a
similar function. An example from the textbook shows a helix-turn-helix motif.
This is a structure that is seen in unrelated proteins that bind DNA, so the
presence of a helix-turn-helix motif is an indication of a protein's function.
Domains, on the other hand, are regions of a protein that have a specific function
and can (usually) function independently of the rest of the protein. A protein that
a lab studies may have multiple domains. It has a DNA binding domain located
towards the N terminus of the protein, and a catalytic domain that is located
closer to the C-terminus. Theoretically one can separate the domains from each
other and the DNA binding domain will still bind DNA and the catalytic domain
will still perform catalysis. There is some overlap with the definitions of domain
and motif. Some motifs are also considered domains, and vice versa. Protein
domains are the structural and functional units of proteins.

Supersecondary structure and Protein folds
Supersecondary structure sits between secondary and tertiary structure
• All-α proteins are a class of structural domains in which the secondary structure is composed entirely of
α-helices, with the possible exception of a few isolated β-sheets on the periphery. Common examples
include the bromodomain, the globin fold and the homeodomain fold.
• all-β - All-β proteins are a class of structural domains in which the secondary structure is composed
entirely of β-sheets, with the possible exception of a few isolated α-helices on the periphery.
• Common examples include the SH3 domain, the beta-propeller domain, the immunoglobulin fold and
B3 DNA binding domain.
• α+β - α+β proteins are a class of structural domains in which the secondary structure is composed of
α-helices and β-strands that occur separately along the backbone. The β-strands are therefore mostly
antiparallel.
• Common examples include the ferredoxin fold, ribonuclease A, and the SH2 domain.
• α/β - α/β proteins are a class of structural domains in which the secondary structure is composed of
alternating α-helices and β-strands along the backbone. The β-strands are therefore mostly parallel.

Types of proteins
Reference: https://scop.mrc-lmb.cam.ac.uk/term/4
• Proteins that fold into compact units and are soluble in aqueous solutions are
globular proteins
• Proteins associated with a cell or organelle membrane are called membrane proteins
• Fibrous Proteins- Mainly structural proteins. A characteristic feature of the
fibrous protein sequences is the presence of repetitive sequence motifs.
• Regions of proteins or entire proteins that at native conditions may lack
ordered structure but in their functional state can undergo disorder-to-order
transition - Non-globular/intrinsically unstructured proteins

Types of intermolecular and intramolecular forces
• There are intermolecular(occurs from one molecule to another molecule) and
intramolecular forces (within the same molecule) in biomolecules.
• Covalent, ionic and metallic are three types of intramolecular forces
• Dispersion(London Dispersion and Van Der Waals), Dipole–Dipole, Hydrogen
Bonding, and Ion-Dipole are the four types of intermolecular forces
• Intermolecular forces are generally much weaker than covalent bonds. For
example, it requires 927 kJ to overcome the intramolecular forces and break both
O–H bonds in 1 mol of water, but it takes only about 41 kJ to overcome the
intermolecular attractions and convert 1 mol of liquid water to water vapor at
100°C. Despite this seemingly low value, the intermolecular forces in water are
among the strongest such forces known.

Importance of intermolecular forces
The attractive force between two different molecules
• Intermolecular forces determine bulk properties, such as the melting points of
solids and the boiling points of liquids. Liquids boil when the molecules have
enough thermal energy to overcome the intermolecular attractive forces that
hold them together, thereby forming bubbles of vapor within the liquid.
Similarly, solids melt when the molecules acquire enough thermal energy to
overcome the intermolecular forces that lock them into place in the solid.

Instantaneous dipole-induced dipole interactions
Instantaneous dipole–induced dipole interactions between nonpolar
molecules can produce intermolecular attractions just as they
produce interatomic attractions in monatomic substances like Xe.
This effect, illustrated for two H2 molecules in part (b), tends to
become more pronounced as atomic and molecular masses increase.
For example, Xe boils at −108.1°C, whereas Helium boils at
−269°C. The reason for this trend is that the strength of London
dispersion forces is related to the ease with which the electron
distribution in a given atom can be perturbed. In small atoms such
as He, the two 1s electrons are held close to the nucleus in a very
small volume, and electron–electron repulsions are strong enough
to prevent significant asymmetry in their distribution. In larger atoms
such as Xe, however, the outer electrons are much less strongly
attracted to the nucleus because of filled intervening shells. As a
result, it is relatively easy to temporarily deform the electron
distribution to generate an instantaneous or induced dipole. The
ease of deformation of the electron distribution in an atom or
molecule is called its polarizability. Because the electron distribution
is more easily perturbed in large, heavy species than in small, light
species, we say that heavier substances tend to be much more
polarizable than lighter ones.

Summary of intermolecular forces- dipole-dipole(DD), LDF and hydrogen
bonds (DD+LDF=VdW)

Intermolecular force- Hydrogen bonding
• A hydrogen atom that is covalently bonded to a highly electronegative atom carries a partial
positive charge and is attracted to a lone pair on another electronegative atom. The resulting bond
(non-covalent, non-ionic) is electrostatic in nature
• This intermolecular bond is commonly found in helices and in DNA structure stabilisation
• Notice that in each of these molecules NH3, HF, H2O:
• The hydrogen is attached directly to a highly electronegative atom, causing the hydrogen to acquire
a significant partial positive charge.
• Each of the highly electronegative atoms attains a significant partial negative charge and has at least
one "active" lone pair. Lone pairs at the 2-level have electrons contained in a relatively small volume
of space, resulting in a high negative charge density. Lone pairs at higher levels are more diffuse,
resulting in a lower charge density and lower affinity for positive charge.
• The δ+ hydrogen is so strongly attracted to the lone pair that it is almost as if you were beginning to
form a co-ordinate (dative covalent) bond. It doesn't go that far, but the attraction is significantly
stronger than an ordinary dipole-dipole interaction.

Bioinformatics Databases and
some tools
Based on Database Schema on Relational Database ; makes
use of primary key(unique) and secondary key. A database
schema defines how data is organized within a relational
database; this is inclusive of logical constraints such as, table
names, fields, data types, and the relationships between
these entities.
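
A schema of this kind can be sketched with Python's built-in sqlite3 module. Everything in the example is hypothetical: the table names, column names and the accession X00001 are made up for illustration and do not come from any real bioinformatics database. The index on organism plays the role of a secondary (non-unique) lookup key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE protein (
    accession TEXT PRIMARY KEY,          -- primary key: unique per protein
    name      TEXT NOT NULL,
    organism  TEXT NOT NULL
);
CREATE TABLE domain (
    domain_id INTEGER PRIMARY KEY,
    accession TEXT NOT NULL REFERENCES protein(accession),
    label     TEXT NOT NULL,
    start_pos INTEGER NOT NULL,
    end_pos   INTEGER NOT NULL
);
CREATE INDEX idx_protein_organism ON protein(organism);  -- secondary lookup key
""")

conn.execute("INSERT INTO protein VALUES (?, ?, ?)",
             ("X00001", "example protein", "Homo sapiens"))
conn.execute(
    "INSERT INTO domain (accession, label, start_pos, end_pos) VALUES (?, ?, ?, ?)",
    ("X00001", "DNA-binding", 10, 80),
)

# The relationship between the two tables is followed with a join
rows = conn.execute(
    "SELECT p.name, d.label FROM protein p JOIN domain d ON d.accession = p.accession"
).fetchall()
print(rows)
```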

Primary and
Secondary
Databases
Primary databases hold first-hand
reports on the sequences, and
secondary databases make use of
that first-hand information

Interpro & Ensembl
Also included FASTA
• InterPro provides functional analysis of proteins by classifying them into families and predicting
domains and important sites. To classify proteins in this way, InterPro uses predictive models,
known as signatures, provided by several different databases (referred to as member databases)
that make up the InterPro consortium. We combine protein signatures from these member
databases into a single searchable resource, capitalising on their individual strengths to produce a
powerful integrated database and diagnostic tool.
• Ensembl is a genome browser for vertebrate genomes that supports research in comparative
genomics, evolution, sequence variation and transcriptional regulation.
• Functional Analysis is used to assign biological or biochemical roles to proteins. Protein Functional
Analysis using the InterProScan program.
• FASTA is a pairwise sequence alignment tool which takes nucleotide or protein
sequences as input and compares them with existing databases. FASTA format is a text-based
format for representing either nucleotide sequences or amino acid (protein) sequences, in which
each record begins with a greater-than symbol (">"), followed by the sequence ID
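
The FASTA format is simple enough to read with a few lines of Python. The sketch below is a minimal parser, not a replacement for an established library; the function name parse_fasta and the two example records are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {sequence ID: sequence}."""
    records = {}
    seq_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first word after ">"; the rest is a description
            parts = line[1:].split()
            seq_id = parts[0] if parts else ""
            records[seq_id] = []
        elif seq_id is not None:
            records[seq_id].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


example = """>seq1 hypothetical peptide
MKTAYIAK
QRQISFVK
>seq2 hypothetical DNA
ATGGCCTAA
"""
print(parse_fasta(example))
```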

Types of FASTA and info on ExPASY
ProtParam is a tool within ExPASY
• FASTA is a pairwise sequence alignment tool. FASTX compares a DNA query to a
protein database and may introduce gaps only between codons. FASTY compares a
DNA query to a protein database, optimizing gap location, even within codons.
TFASTA compares a protein query to a DNA database.
• The ExPASy (the Expert Protein Analysis System) World Wide Web server
(http://www.expasy.org), is provided as a service to the life science community by a
multidisciplinary team at the Swiss Institute of Bioinformatics (SIB).
• ProtParam is a tool which allows the computation of
various physical and chemical parameters for a given protein stored in Swiss-Prot or
TrEMBL or for a user entered protein sequence. The computed parameters include
the molecular weight, theoretical pI, amino acid composition, atomic composition,
extinction coefficient, estimated half-life, instability index, aliphatic index and grand
average of hydropathicity (GRAVY)
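
GRAVY is the sum of the hydropathy values of all residues divided by the sequence length. The sketch below uses the Kyte-Doolittle hydropathy scale; the function name gravy and the test peptides are hypothetical, and the code is an illustration rather than ProtParam's own implementation.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity for a one-letter protein sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


print(gravy("GAVLI"))  # aliphatic residues: positive, i.e. hydrophobic
print(gravy("DEKR"))   # charged residues: negative, i.e. hydrophilic
```

A positive score points to a hydrophobic sequence overall and a negative score to a hydrophilic one.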

UniProtKB/Swiss-Prot & PIR
Protein Information Resource(PIR)
• The UniProt Knowledgebase (UniProtKB) is the central hub for the collection of
functional information on proteins, with accurate, consistent and rich annotation.
• UniProtKB/Swiss-Prot is a manually annotated, non-redundant protein sequence
database. It combines information extracted from scientific literature and
biocurator-evaluated computational analysis. The aim of UniProtKB/Swiss-Prot is
to provide all known relevant information about a particular protein.
• PIR maintains the Protein Sequence Database (PSD), an annotated protein
database containing over 283 000 sequences covering the entire taxonomic
range. Family classification is used for sensitive identification, consistent
annotation, and detection of annotation errors.

PIR(Protein Information Resource)
TrEMBL - Translated EMBL
• PIR maintains the Protein Sequence Database (PSD), an annotated protein database
containing over 283 000 sequences covering the entire taxonomic range
• PIR also maintains the Non-redundant REFerence (NREF) sequence
database and iProClass, an integrated
database of protein family, function, and structure information. PIR-NREF provides a
timely and comprehensive collection of protein sequences, currently consisting of
more than 1 000 000 entries from PIR-PSD, SWISS-PROT, TrEMBL, RefSeq,
GenPept, and PDB.
• UniProtKB/TrEMBL contains the translations of all coding sequences (CDS) present
in the EMBL/GenBank/DDBJ Nucleotide Sequence Databases and also protein
sequences extracted from the literature or submitted to UniProtKB/Swiss-Prot.
The database is enriched with automated classification and annotation.

Annotation in Genome and Protein
• Genome annotation is the process of deriving the structural and functional information of a
protein or gene from a raw data set using different analysis, comparison, estimation, precision,
and other mining techniques.
• Sequence annotations in UniProtKB describe regions or sites of interest in the protein
sequence, such as post-translational modifications, binding sites, enzyme active sites, local
secondary structure or other characteristics reported in the cited references. Sequence
conflicts between references are also described in this manner.
• Structural genomics seeks to describe the 3-dimensional structure of every protein encoded by
a given genome. This genome-based approach allows for a high-throughput method of
structure determination by a combination of experimental and modeling approaches. The
principal difference between structural genomics and traditional structural prediction is that
structural genomics attempts to determine the structure of every protein encoded by the
genome, rather than focusing on one particular protein.
Also covered Structural genomics

NCBI ORF Finder
And a few more tools
• The Coding Sequence (CDS) is the actual region of DNA that is translated to
form proteins. While the ORF may contain introns as well, the CDS refers to
those nucleotides(concatenated exons) that can be divided into codons which
are actually translated into amino acids by the ribosomal translation
machinery. In Prokaryotes the ORF and the CDS are the same.
• NCBI ORF Finder helps in finding the Open Reading Frame
• Also there are tools which make use of HMM(Hidden Markov Models) and
Gibbs Sampler (Need to read more …)
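
The idea of an open reading frame can be illustrated with a simple scan for ATG...stop stretches. This is a deliberately reduced sketch and not how NCBI ORF Finder is implemented: it only looks at the three forward frames, ignores the reverse strand and alternative genetic codes, and the function name find_orfs and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for ATG...stop ORFs in the forward frames.

    start and end are 0-based, end-exclusive positions; the stop codon is
    included in the returned sequence and in the codon count.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


print(find_orfs("CCATGGCTTGATAAATGAAATAG"))
```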

Some softwares from NIH NCBI
CASP is a community experiment that assesses protein structure prediction techniques
• Gene3D: Assigning CATH Structures to all Protein Sequences-takes CATH
domain families (from PDB structures) and assigns them to the millions of
protein sequences with no PDB structures. (Mask of Zorro kind of hero)
• Gene Ontology(GO) - Controlled vocabulary describing the function of a gene
product in any organism. There are three independent sets of vocabularies, or
ontologies; that describe molecular function of the gene product, the biological
process in which it participates, and the cellular component where it can be
found.
• OMIM (Online Mendelian Inheritance in Man)- Genetic variations and
phenotypes
• Critical Assessment of Techniques for Protein Structure Prediction (CASP)

CATH - Class, Architecture, Topology and Homology Database
• The latest version of CATH (class, architecture, topology, homology) contains 114,215 domains,
2178 Homologous superfamilies and 1110 fold groups.
• 20,330 new domains, 87 new homologous superfamilies and 26 new folds have been developed
since CATH release version 3.1.
• A total of 28,064 new domains have been assigned since CATH version 3.0.
• The CATH website has been completely redesigned and includes more comprehensive
documentation.
• The CATHEDRAL structure comparison algorithm has been improved and used to characterize
structural diversity in CATH superfamilies and structural overlaps between superfamilies.
• Although the majority of superfamilies in CATH are not structurally diverse and do not overlap
significantly with other superfamilies, approximately 4% of superfamilies are very diverse and these
are the superfamilies that are most highly populated in both the PDB and in the genomes.
• Information on the degree of structural diversity in each superfamily and structural overlaps between
superfamilies can now be downloaded from the CATH website.

SCOP - Structural Classification of Proteins DB
It is also a DB focussing on Structural Classification of Proteins

Types of Basic Local Alignment Search Tool (BLAST)
Program | Database | Query | Typical uses
BLASTN | Nucleotide | Nucleotide | Mapping oligonucleotides, cDNAs, and PCR products to a genome; screening repetitive elements; cross-species sequence exploration; annotating genomic DNA; clustering sequencing reads; vector clipping
BLASTP | Protein | Protein | Identifying common regions between proteins; collecting related proteins for phylogenetic analyses
BLASTX | Protein | Nucleotide translated into protein | Finding protein-coding genes in genomic DNA; determining if a cDNA corresponds to a known protein
TBLASTN | Nucleotide translated into protein | Protein | Identifying transcripts, potentially from multiple organisms, similar to a given protein; mapping a protein to genomic DNA
TBLASTX | Nucleotide translated into protein | Nucleotide translated into protein | Cross-species gene prediction at the genome or transcript level; searching for genes missed by traditional methods or not yet in protein databases

Dipole moments
Dipole moments can be permanent due to the bond formation and differing
electronegativities of the molecules involved in the bond.
Dipole moments can be temporary/induced due to the introduction of an
electric field or presence of an ion nearby
Potential Energy on a molecular level: Energy stored in bonds and static
interactions are: Covalent bonds, Electrostatic forces, Nuclear forces

London Dispersion Force
LDFs exist for all substances, whether composed of polar or nonpolar
molecules. LDFs arise from the formation of temporary, instantaneous polarities
across a molecule caused by the movement of electrons. They act in addition to
any permanent or induced dipole interactions.
An instantaneous polarity in one molecule may induce an opposing polarity in an
adjacent molecule, resulting in a series of attractive forces among neighboring
molecules.
Molecules with higher molecular weights have more electrons. This makes their
electron clouds more deformable from nearby charges, a characteristic called
polarizability.
As a result, molecules with higher molecular weights have higher LDF and
consequently have higher melting points, boiling points and enthalpies of
vaporization.

When electronegativity difference is >1.9, ionic bonds-
crystal lattice structure is formed
When the electronegativity difference
between bonded atoms is moderate to
zero, i.e., usually less than 1.9, the
bonding electrons are shared between
the bonded atoms, as illustrated in Fig.
3.9.4. The attractive force between the
bonding electrons and the nuclei is the
covalent bond that holds the atoms
together in the molecules. The covalent
bond is usually weaker than the metallic
and the ionic bonds but much stronger
than the intermolecular forces.

Which force is greater?
Metallic>Ionic>Covalent>Intermolecular(Hydrogen Bonding>Dipole>London
Dispersion Force)
The electronegativity difference between H and O, N, or F is usually more
than other polar bonds. The charge density on hydrogen is higher than the
δ+ ends of the rest of the dipoles because of the smaller size of hydrogen.
The δ+ Hydrogen can penetrate in less accessible spaces to interact with the
δ- O, N, or F of the other molecule because of its small size.

Components in DNA and RNA
Purine is a heterocyclic
aromatic organic compound
composed of a pyrimidine ring
fused with an imidazole ring.
The purine nucleobases are
adenine and guanine.
Pyrimidine is a heterocyclic
aromatic organic compound
whose six-membered ring is
composed of carbon and
nitrogen atoms. The pyrimidine
nucleobases are cytosine,
thymine and uracil.

Electron acceptor and electron donor
An electron donor and an electron acceptor are called a nucleophile and an
electrophile in enzyme and protein chemistry
In acid-base terms, the donor is a base (typically the deprotonated form) and
the acceptor is an acid (typically the protonated form). According to Lewis
theory, an acid is an electron pair acceptor and a base is an electron pair donor
Useful terms in understanding the chymotrypsin active-site mechanism: an alkoxide
ion is an ion with a negative formal charge on an oxygen atom. An acyl group is
formed by removal of OH from an oxoacid, i.e. R-C=O
