Proteins Medical Chemistry Lecture 14 2007 (J.S.)
Proteins
Proteins are polypeptides that exhibit higher levels of structural organization. Their biologically active conformation is established by the process called protein folding, which is cotranslational (it takes place before the newly synthesized polypeptide is released from the ribosome). All of the information required for a protein to fold is contained in the primary structure. In this respect, proteins differ qualitatively from other peptides, regardless of the lengths of their chains; quantitative differences are of minor importance. Small proteins comprise more than approx. 50 aminoacyl residues; a large number of proteins comprise hundreds of residues. More than one or two thousand residues in one peptide chain occur only exceptionally (e.g. thyroglobulin, or titin in skeletal muscle). Most proteins have a relative molecular mass M_r in the range from 6 000 to 200 000 (about 110 per aminoacyl residue).
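The rule of thumb of about 110 per aminoacyl residue can be turned into a quick estimate. The following is only a minimal sketch; the function names are illustrative, and the result is a rough approximation because real residue masses differ.

```python
# Rough estimate only: the lecture quotes about 110 per aminoacyl residue.
AVERAGE_RESIDUE_MASS = 110

def estimate_mr(n_residues: int) -> int:
    """Approximate relative molecular mass of a chain with n_residues residues."""
    return n_residues * AVERAGE_RESIDUE_MASS

def estimate_residues(mr: float) -> int:
    """Approximate number of residues in a chain of relative molecular mass mr."""
    return round(mr / AVERAGE_RESIDUE_MASS)

if __name__ == "__main__":
    print(estimate_mr(50))             # smallest proteins, approx. 50 residues
    print(estimate_residues(6_000))    # lower end of the usual M_r range
    print(estimate_residues(200_000))  # upper end of the usual M_r range
```

By this estimate, the quoted M_r range of 6 000 to 200 000 corresponds to chains of roughly 55 to 1 800 residues.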
Functions of proteins
Structural proteins – cytoskeleton proteins, collagens, elastin, keratins, proteins enabling movement (tubulin, non-muscle actin) and contraction of muscle cells (muscle actin and myosin), etc.
Metabolic functions – enzymes (biocatalysts), transducers of energy (e.g. rhodopsin), membrane transporters, transport proteins in blood plasma, nutritive function, maintenance of the oncotic pressure of blood plasma, protein buffers, etc.
Transfer of information – signal proteins (chemical messengers), receptor proteins, immunoglobulins (circulating antibodies), etc.
Every polypeptide chain of a protein consists of the main chain (polypeptide backbone), in which the nitrogen atoms of α-amino groups, α-carbons, and carbons of α-carboxyls alternate regularly. Side chains of the aminoacyl residues are the branches attached to the main chain at the α-carbons (in the figure, R denotes the side chains).
Hierarchical organization of protein structure – four levels of structural organization
Three levels of organization occur in all proteins: – primary structure, – secondary structure, and – tertiary structure. Not all proteins have a quaternary structure. Those that do are clusters of two or more subunits (monomers or protomers) held together by non-covalent interactions. The subunits may be identical, or they may differ, each with its own primary, secondary, and tertiary structure.
The primary structure of a protein is the sequence of aminoacyl residues in its polypeptide chain.
By convention, the sequence is written from the N-terminal residue to the C-terminal residue.
This simplest level of structural organization is in some respects the most important:
The specific protein conformation (higher levels of protein structure) and the biological function of a protein are determined by its primary structure.
Example: The primary structure of human insulin A and B chains
Insulin is formed by hydrolytic excision of the C-peptide from proinsulin. Proinsulin has the sequence B chain–C-peptide–A chain; the C-peptide connects the C-terminus of the B chain with the N-terminus of the A chain. The figure shows the covalent structure of insulin: besides the primary structures of both chains, it gives the positions of the three disulfide bridges that connect remote parts of the molecule (an important part of the insulin tertiary structure).
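The insulin example can be written down as data. The sketch below is illustrative only: the one-letter sequences and the bridge positions (A6–A11, A7–B7, A20–B19) are not reproduced in the extracted slide text and are supplied here from the standard human insulin structure; the helper names are hypothetical.

```python
# Human insulin chains in one-letter code, written N-terminus to C-terminus.
# Assumption: sequences and bridge positions are added from the standard
# human insulin structure; they are not part of the extracted slide text.
CHAINS = {
    "A": "GIVEQCCTSICSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",
}

# Three disulfide bridges, positions numbered from 1 within each chain.
DISULFIDES = [
    (("A", 6), ("A", 11)),   # intrachain bridge within the A chain
    (("A", 7), ("B", 7)),    # interchain bridge
    (("A", 20), ("B", 19)),  # interchain bridge
]

def residue(chain: str, position: int) -> str:
    """Return the residue at a 1-based position of the given chain."""
    return CHAINS[chain][position - 1]

def bridges_are_valid() -> bool:
    """Check that both ends of every listed bridge are cysteine (C)."""
    return all(residue(*a) == "C" and residue(*b) == "C" for a, b in DISULFIDES)

if __name__ == "__main__":
    for name, seq in CHAINS.items():
        print(name, len(seq), "residues")
    print("all bridges join two cysteines:", bridges_are_valid())
```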
The secondary structure is a local conformation of the backbone atoms in particular segments with no regard to the side chains and to the relations of the segment to other remote segments of the polypeptide chain.
Secondary structure
The spatial arrangement of the main-chain segments varies, particularly in globular proteins, owing to rotations round the N–Cα and Cα–C(carbonyl) bonds.
Torsion angles
– Torsion angle φ describes rotation round the N–Cα bond.
– Torsion angle ψ describes rotation round the Cα–C(carbonyl) bond.
– Torsion angle ω describes rotation round the C(carbonyl)–N peptide bond. Free rotation round this bond is not possible: ω is either 180° (trans-peptide bond) or, rarely, 0° (cis-peptide bond).
Any conformation can be described by the torsion angles φ, ψ, and ω. (The figure shows the example values +60° and –120°.)
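A torsion angle is defined by four consecutive atoms, so it can be computed directly from their coordinates. The following is a minimal sketch using only the standard library; the function name and the coordinates in the usage example are made up for illustration. By the usual convention, φ uses the atoms C(i–1), N, Cα, C; ψ uses N, Cα, C, N(i+1); and ω uses Cα, C, N(i+1), Cα(i+1).

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_angle(p0, p1, p2, p3):
    """Torsion angle in degrees round the p1-p2 bond, in the range -180..180."""
    b0 = _sub(p0, p1)            # points from p1 back to p0
    b1 = _sub(p2, p1)            # the central bond
    b2 = _sub(p3, p2)            # points from p2 on to p3
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

if __name__ == "__main__":
    # Idealized coordinates (hypothetical): planar trans and cis arrangements.
    trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
    cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
    print(abs(trans), cis)
```

The two test geometries correspond to the trans (180°) and cis (0°) arrangements quoted for the peptide bond.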
Conformation map (Ramachandran diagram) shows the sterically allowed φ and ψ angles, calculated using the van der Waals limiting distances for interatomic contacts. The conformational range of a polypeptide backbone is limited: there are many steric constraints caused by the steric interference of side chains attached at the α-carbons.
Figure legend:
– regions of "normally allowed" φ and ψ angles for poly-Ala
– conformations having "outer limit" van der Waals distances
– far greater conformational freedom for Gly residues
Nonrepetitive structures:
– bends that abruptly change the direction of chains
– compact loops
– kinks and bends in helical structures
– bends and bulges of β-sheets
– various loops and coil configurations (not random!)
– truly disordered regions (Lys side chains, N- or C-termini that wave around in solution)
α-Helix
The α-helix is right-handed. Helical pitch of 0.54 nm (the distance the helix rises along its axis per turn); 3.6 residues per turn. The core is tightly packed; the atoms are in van der Waals contact. Stabilized by hydrogen bonds. α-Helices have an average span of about 11 residues (3 turns), though helices with as many as 53 residues have been found. The α-helix is a common element of both fibrous and globular proteins.
In α-helices, hydrogen bonds are formed between the carbonyl group of residue i and the NH group of residue i + 4 (the H-bond connects the 1st and the 5th residue). The H-bonds are nearly parallel with the helix axis.
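The i to i + 4 rule can be enumerated for a helix of a given length. This is a small illustrative sketch; the function name is hypothetical, and it ignores the fact that real helix ends are often irregular.

```python
def helix_hbond_pairs(n_residues: int, offset: int = 4):
    """List (acceptor, donor) residue numbers for C=O(i) ... H-N(i + offset).

    Residues are numbered from 1; offset 4 corresponds to the helix described
    in the text, where the H-bond connects the 1st and the 5th residue.
    """
    return [(i, i + offset) for i in range(1, n_residues - offset + 1)]

if __name__ == "__main__":
    # A helix of average span (about 11 residues).
    for acceptor, donor in helix_hbond_pairs(11):
        print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

For an 11-residue helix this gives seven hydrogen bonds; the NH groups of the first four residues and the C=O groups of the last four have no partner within the helix.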
α-Helix – top view
Positions where side chains are attached are projected down the helix axis onto a plane. Diameter of the α-helix without side chains: about 0.5 nm; diameter including side chains: from 1.1 nm to 1.5 nm. The side chains (the R groups) all project backward and outward from the helix.
Bragg's notation for helical secondary structures
The notation gives the number of amino acid residues per turn and, as a subscript, the number of atoms in the ring closed by one hydrogen bond (e.g. 3.6₁₃: 3.6 residues per turn, 13 atoms in the ring).
– α-Helix 3.6₁₃.
– π-Helix 4.4₁₆ is comparatively wide and flat (pitch of 0.52 nm); it is less stable and has an axial hole. It is observed only rarely, at the ends of longer α-helices.
– Helix 3₁₀ is thinner and rises more steeply than the α-helix (pitch of 0.60 nm); it most often occurs as a single turn forming the transition between one end of an α-helix and the adjoining portion of the polypeptide chain.
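The helix parameters quoted above can be compared numerically. The sketch below is an illustration under stated assumptions: the pitches and residues per turn come from the text, while the hydrogen-bond offsets for the 3₁₀ helix (i to i + 3) and the 4.4₁₆ helix (i to i + 5) are standard values added here. The ring size follows from counting atoms: O, C, three backbone atoms for each intervening residue, then N and H, which gives 3k + 1 for an offset of k.

```python
# Pitch (nm) and residues per turn are taken from the lecture text.
# Assumption: the hbond_offset values for the 3_10 and 4.4_16 helices are
# standard values added for illustration, not stated in the slides.
HELICES = {
    "3.6_13": {"residues_per_turn": 3.6, "pitch_nm": 0.54, "hbond_offset": 4},
    "4.4_16": {"residues_per_turn": 4.4, "pitch_nm": 0.52, "hbond_offset": 5},
    "3_10": {"residues_per_turn": 3.0, "pitch_nm": 0.60, "hbond_offset": 3},
}

def rise_per_residue(name: str) -> float:
    """Axial rise per residue in nm: pitch divided by residues per turn."""
    helix = HELICES[name]
    return helix["pitch_nm"] / helix["residues_per_turn"]

def ring_atoms(hbond_offset: int) -> int:
    """Atoms in the ring closed by an H-bond from C=O(i) to N-H(i + offset)."""
    return 3 * hbond_offset + 1

if __name__ == "__main__":
    for name, helix in HELICES.items():
        print(name,
              f"rise per residue {rise_per_residue(name):.3f} nm,",
              f"{ring_atoms(helix['hbond_offset'])} ring atoms")
```

The rise per residue (about 0.15 nm, 0.12 nm and 0.20 nm, respectively) shows numerically why the 4.4₁₆ helix is described as flat and the 3₁₀ helix as steep.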
The β-structure of a peptide main chain differs slightly from the fully extended chain conformation (all-trans, φ = 180°, ψ = 180°) by somewhat lower absolute values of the torsion angles (φ = –140°, ψ = +150°). Viewed edge-on, the main chain has a pleated appearance, and the side chains extend alternately to opposite sides. Within a single chain of this sort, hydrogen bonds like those in α-helices cannot form. The chains are usually stabilized by hydrogen bonds between neighbouring chains having the same structure, which gives the β-pleated sheets.
The two-stranded antiparallel β-pleated sheet
Neighbouring hydrogen-bonded polypeptide chains run in opposite directions. (Figure: top view and side view.)
The two-stranded parallel β-pleated sheet
Neighbouring hydrogen-bonded chains extend in the same direction. (Figure: top view.)
The connections between adjacent polypeptide strands in β-pleated sheets:
– The hairpin connection between antiparallel strands. Sheets may contain 2–15 strands, 6 strands on average.
– A right-handed crossover connection between successive strands of a parallel β-sheet. Sheets comprise more than 5 strands.
The usual length of a β-structure is from 6 to 15 amino acid residues. β-Sheets exhibit a pronounced right-handed twist or curl.
The primary structure of collagen is unusual. The polypeptide chains α1 and α2 of collagen I (the most common type of collagen) have a regularly repeating sequence of amino acid residues in which glycine is found at every third position (Gly-X-Y). X and Y are often proline or hydroxyproline (together about one quarter of the amino acid residues in collagen).
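The Gly-X-Y rule is easy to check on a sequence fragment. The following is a hedged sketch: the function names are illustrative, the test fragment is invented rather than taken from a real collagen chain, and the letter O for hydroxyproline is an assumed convention (it is not one of the standard one-letter codes for the proteinogenic amino acids).

```python
def is_gly_x_y_repeat(sequence: str) -> bool:
    """True if the fragment consists of whole triplets that each start with Gly (G)."""
    if not sequence or len(sequence) % 3 != 0:
        return False
    return all(sequence[i] == "G" for i in range(0, len(sequence), 3))

def fraction_pro_hyp(sequence: str) -> float:
    """Fraction of residues that are proline (P) or hydroxyproline (O, assumed code)."""
    return sum(sequence.count(code) for code in "PO") / len(sequence)

if __name__ == "__main__":
    fragment = "GPOGPAGEK"  # invented fragment for illustration only
    print(is_gly_x_y_repeat(fragment))
    print(round(fraction_pro_hyp(fragment), 2))
```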
Steep helix of the tropocollagen chains
The pyrrolidine ring of proline residues strongly restricts the geometry of the main chain of a protein that contains it: prolyl residues introduce abrupt changes (bends) in the direction of the chain.
Steep left-handed helix of tropocollagen single chains
3.0–3.3 amino acid residues per turn, helical pitch of 0.86–1.00 nm. Both C=O and NH groups are directed outward from the helix (perpendicular to the helix axis), so they cannot form intrachain H-bonds.
The single helical chains are stabilized by interchain H-bonds within the right-handed triple helix, the tropocollagen unit: 30 amino acid residues per turn, helical pitch of 8.6 nm.
Reverse turns (β-bends) often connect successive strands of antiparallel β-sheets or helical segments at protein surfaces and abruptly change the chain direction. They comprise four amino acid residues, stabilized by an H-bond between the first and the fourth residue. Gly and Pro are often in positions 2 and 3. The two types (type I and type II) differ by a 180° flip of the peptide unit linking residues 2 and 3.
Tertiary structure
The tertiary structure of a protein (or of protein subunits) is the three-dimensional arrangement of all its atoms, including those of its side chains. The stability of this biologically active, or native, conformation depends on interactions between the side chains of amino acid residues, which include
– ionic interactions (salt bridges),
– hydrogen bonds,
– hydrophobic interactions, and
– covalent cross-links.
Electrostatic interactions (salt bridges) exist between the positively charged side chains of the basic amino acids lysine (–NH₃⁺), arginine (guanidinium), and histidine (imidazolium), and the carboxylate anions of the acidic side chains in residues of aspartate and glutamate. An isolated charged residue is never found in the hydrophobic interior of a globular protein. Two oppositely charged groups, however, can form an ion pair.
Hydrogen bonds
The –CO–NH– groups of the main chain stabilize the secondary structure. In addition, they can form H-bonds with polar side chains of amino acid residues or with water. Polar groups with hydrogen-bonding ability occur in the side chains of serine and threonine (alcoholic hydroxyl), tyrosine (phenolic hydroxyl), asparagine and glutamine (group –CO–NH₂), cysteine (sulfanyl group), and histidine (nitrogen atom of non-ionized imidazole). These groups can form H-bonds with water, with one another, or with the –CO–NH– groups of the main chain.
Hydrophobic interactions comprise both weak van der Waals forces between nonpolar side chains of amino acids (e.g. branched-chain valine, leucine, isoleucine, or the aromatic rings of phenylalanine and tryptophan) and the hydrophobic effect. In aqueous solutions of globular proteins, a polypeptide chain folds in a way that removes hydrophobic side chains from contact with water, so that they are in contact with one another in the centre of the protein. The cage of ordered water molecules round the protein is then of minimal size, which results in a relative increase in entropy.
Covalent bonds stabilizing the tertiary structure are, besides the peptide bonds of the main chain, covalent bonds between the side chains of residues:
– Disulfide bridges between the sulfanyl groups of cysteine residues.
– Other covalent cross-links, e.g. products of the reaction between the amino groups in the side chains of lysine and modified lysine side chains carrying an aldehyde group (the result of oxidation of lysine to allysine): aldol-type or aldimine-type cross-links. (The figure shows a hydrogenated aldimine.)
The tertiary structure of a haemoglobin subunit
The side chains that fill in the interhelical space are not drawn.
Bovine ribonuclease (ribbon model)
– Three short α-helices (residues 5–12, 28–35, 48–55)
– Two-stranded β-pleated sheet (residues 70–110)
– Deep cleft (active centre)
Domains
The tertiary structure of proteins, especially of large proteins containing more than 200 residues, frequently consists of several domains: compact units connected by short segments of the peptide chain. The domains are relatively independent of one another and may exhibit different biological activities.
Immunoglobulins
Variable domains are responsible for the specificity of the antibody; constant domains fulfil other functions. Two heavy and two light chains are joined through disulfide bridges. Each light chain consists of two domains, one of which is the variable domain. Each heavy chain consists of four domains, one variable and three constant.
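The domain arithmetic for one immunoglobulin molecule can be tallied as follows. This is a minimal sketch with hypothetical names; it assumes that the second domain of each light chain is a constant domain, which the text implies but does not state outright.

```python
# Chain and domain counts as described in the text.
# Assumption: the non-variable domain of each light chain is a constant domain.
IMMUNOGLOBULIN = {
    "light": {"chains": 2, "variable": 1, "constant": 1},
    "heavy": {"chains": 2, "variable": 1, "constant": 3},
}

def count_domains(kind: str) -> int:
    """Total number of domains of one kind ('variable' or 'constant') per molecule."""
    return sum(chain["chains"] * chain[kind] for chain in IMMUNOGLOBULIN.values())

if __name__ == "__main__":
    variable = count_domains("variable")
    constant = count_domains("constant")
    print(variable, constant, variable + constant)
```

Under these assumptions the molecule has four variable and eight constant domains, twelve in total.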
Quaternary structure
Some proteins exist as oligomers consisting of several subunits (protomers), which are linked only through non-covalent bonds. Quaternary structure refers to the number of subunits, the spatial arrangement of the protomers in an oligomer, and the types of non-covalent bonds. Examples: haemoglobin (four subunits of two types), myosin (six polypeptide chains of three types), lactate dehydrogenase (four subunits of two types).
Quaternary structure of haemoglobin
Figure: deoxygenated haemoglobin (T-conformation; subunits α₁, α₂, β₁, β₂ with bound 2,3-bisphosphoglycerate) + 4 O₂ ⇌ oxyhaemoglobin (R-conformation; each of the subunits α₁, α₂, β₁, β₂ carries one O₂).
Classification of proteins
The old classification was based mostly on the solubility of simple proteins (e.g., albumins, globulins, histones) and on the type of prosthetic group of conjugated proteins (metalloproteins, phosphoproteins, glycoproteins, lipoproteins, and nucleoproteins). It is no longer used.
Three major groups of proteins: – globular, – fibrous, and – membrane proteins.
Globular proteins may be classified further according to the prevalent secondary structure: α proteins (for example, haemoglobin, α-helical in 65 %), β proteins, α+β proteins (separated segments, e.g. lysozyme), and α/β proteins (alternating segments, e.g. glycolytic enzymes).
Fibrous proteins also differ widely from one another. An α-helical secondary structure prevails in keratin, tropomyosin, and the light segments of myosin; triple helices of steep-helical chains are typical of collagen; actin filaments are polymers of the globular monomer (G-actin).
Membrane proteins
Membrane proteins are inserted in the lipid bilayer or bound to either of its surfaces.
Integral non-penetrating membrane protein bound through hydrophobic interactions with lipid bilayer
Integral membrane-penetrating (glyco)proteins
Figure labels: Type I; Type II (less common "reversed" type, e.g. transferrin receptor); Type III; PI-link; Type IV; e.g. superfamily of receptors interacting with G-proteins.