Chap. 3 Protein Structure & Function
Topics
• Hierarchical Structure of Proteins
• Protein Folding
• Examples of Protein Function: Ligand-binding Proteins & Enzymes
• Regulating Protein Function by Protein Degradation
• Regulating Protein Function by Noncovalent and Covalent Modifications
Goals
Learn the basic structure and properties of proteins and enzymes, which carry out most of the work in cells (Fig. 3.1).
Overview of Protein Structure Hierarchy
The four levels of protein structure are illustrated in Fig. 3.2. A detailed discussion of each of these levels is presented in the next few slides. Experiments have shown that the final 3D tertiary structure of a protein is ultimately determined by the primary structure (amino acid sequence). The 3D fold (shape) of the protein determines its function.
Primary Structure
The primary structure of a protein refers to its amino acid sequence. Amino acids in peptides (<30 aas) and proteins (typically 200 to 1,000 aas) are joined together by peptide bonds (amide bonds) between the carboxyl and amino groups of adjacent amino acids (Fig. 3.3). The backbone of all proteins consists of a repeating [-N-Cα(R)-C(O)-] unit. Only the R-group side chains vary. By convention, protein sequences are written left to right, from the protein's N-terminus to its C-terminus. The average yeast protein contains 466 amino acids. Because the average molecular weight of an amino acid residue is about 113 daltons (Da), the average molecular weight of a yeast protein is about 52,728 Da. Note that 1 Da = 1 a.m.u., approximately the mass of one proton.
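The mass estimate above is simple arithmetic, and a short Python sketch makes it explicit. The function name is hypothetical, and the constant is the rounded 113 Da average quoted on the slide, so the result comes out slightly below the 52,728 Da figure (which presumably reflects an unrounded average).

```python
# Rounded average mass of one amino acid residue (Da), as quoted on the slide.
AVERAGE_RESIDUE_MASS_DA = 113


def estimate_protein_mass_da(n_residues):
    """Rough protein mass in daltons from its residue count (hypothetical helper)."""
    if n_residues < 0:
        raise ValueError("residue count must be non-negative")
    return n_residues * AVERAGE_RESIDUE_MASS_DA


if __name__ == "__main__":
    # Average yeast protein from the slide: 466 residues.
    print(estimate_protein_mass_da(466))  # 52658
```

The same estimate is often used in reverse, dividing a measured mass by about 113 Da to guess the number of residues.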
Secondary Structure: α Helix
Secondary structure refers to short-range, periodic folding elements that are common in proteins. These include the α helix, the β sheet, and turns. In the α helix (Fig. 3.4), the backbone adopts a cylindrical spiral structure in which there are 3.6 aas per turn. The R-groups point out from the helix, and mediate contacts to other structure elements in the folded protein. The α helix is stabilized by H-bonds between backbone carbonyl oxygen and amide nitrogen atoms that are oriented parallel to the helix axis. H-bonds occur between residues located in the n and n + 4 positions relative to one another.
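The two numbers on this slide (3.6 residues per turn and n to n + 4 H-bonding) can be turned into a small Python sketch. The function names are hypothetical, residues are numbered from 1, and the sketch treats the whole segment as one ideal helix.

```python
RESIDUES_PER_TURN = 3.6  # value quoted on the slide


def rotation_per_residue_deg():
    """Angle (degrees) swept around the helix axis by each successive residue."""
    return 360.0 / RESIDUES_PER_TURN


def backbone_hbond_pairs(n_residues):
    """List (n, n + 4) residue pairs that can be H-bonded within one ideal helix.

    The carbonyl oxygen of residue n pairs with the amide N-H of residue n + 4.
    Pairs whose partner would fall beyond the end of the helix are omitted.
    """
    return [(n, n + 4) for n in range(1, n_residues - 3)]


if __name__ == "__main__":
    print(round(rotation_per_residue_deg()))  # about 100 degrees per residue
    print(backbone_hbond_pairs(8))
```

The roughly 100 degree step per residue is why residues three or four positions apart end up on the same face of the helix.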
Secondary Structure: β Sheets & Turns
In β sheets (a.k.a. "pleated sheets"), each β strand adopts an extended conformation (Fig. 3.5). β strands tend to occur in pairs or multiple copies in β sheets, where they interact with one another via H-bonds directed perpendicular to the axis of each strand. Carbonyl oxygens and amide nitrogens in the strands form the H-bonds. Strands can orient antiparallel (Fig. 3.5a) or parallel (not shown) to one another in β sheets. R-groups of every other amino acid point up or down relative to the sheet (Fig. 3.5b). Most β strands in proteins are 5 to 8 aas long. β turns consist of 3-4 amino acids that form tight bends (Fig. 3.6). Glycine and proline are common in turns. Longer connecting segments between β strands are called loops.
Tertiary Structure
Tertiary structure refers to the folded 3D structure of a protein. It is also known as the native structure or active conformation. Tertiary structure is stabilized mostly by noncovalent interactions between secondary structure elements and other internal sequence regions that cannot be classified as a particular type of secondary structure. The folding of proteins is thought to be driven by the need to place the most hydrophobic regions in the interior, out of contact with water (Fig. 3.7). The structures of hundreds of proteins have been determined by techniques such as x-ray crystallography and NMR. Different methods of representing structures are shown in Fig. 3.8. Keep in mind that most proteins are somewhat flexible and undergo subtle conformational changes while carrying out their functions.
Secondary Structure Motifs
Secondary structure motifs are evolutionarily conserved collections of secondary structure elements which have a defined conformation. They also have a consensus sequence because the aa sequence ultimately determines structure. A given motif can occur in a number of proteins where it carries out the same or similar functions. Some well-known examples such as the coiled-coil, EF hand/helix-loop-helix, and zinc-finger motifs are illustrated in Fig. 3.9. These motifs typically mediate protein-protein association, calcium/DNA binding, and DNA or RNA binding, respectively.
Quaternary Structure
Multisubunit (multimeric) proteins have another level of structural organization known as quaternary structure. Quaternary structure refers to the number of subunits, their relative positions, and contacts between the individual monomers in a multimeric protein. The quaternary structure of the trimeric hemagglutinin surface protein of influenza virus is shown in Fig. 3.10b. The tertiary structure of a hemagglutinin monomer is shown in Fig. 3.10a.
Modular Domain Structure of Proteins
Domains are independently folding and functionally specialized tertiary structure units within a protein. The respective globular and fibrous structural domains of the hemagglutinin monomer (which happen to be individual polypeptide chains) are illustrated in Fig. 3.10a. Domains, such as the epidermal growth factor (EGF) domain, also may be encoded within a single polypeptide chain, as illustrated in Fig. 3.11. Domains still perform their standard functions when fused together in a longer polypeptide (e.g., the DNA-binding and ATPase domains of a transcription factor). The modular domain structure of many proteins has resulted from the shuffling and splicing together of their coding sequences within longer genes.
Supramolecular Structure
In many cases, multimeric proteins achieve extremely large sizes, e.g., tens to hundreds of subunits. Such complexes exhibit the highest level of structural organization, known as supramolecular structure. Examples include mRNA transcription preinitiation complexes (Fig. 3.12), ribosomes, proteasomes, and spliceosomes. Typically, supramolecular complexes function as "macromolecular machines", in reference to the fact that the activities of individual subunits are coordinated in the performance of some overall task (e.g., protein synthesis by the ribosome).
Evolution of Protein Families
Through genome sequencing and classical gene cloning approaches, the sequences of an enormous number of proteins have been compiled. Comparison of sequences shows that most proteins belong to larger families that have evolved over time from a common ancestor protein, as illustrated for the globin family of O2-binding proteins (Fig. 3.13). Proteins that have a common ancestor are called homologs. The members of a protein family often show >30% sequence identity, have a common 3D fold, and usually perform closely related functions.
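Percent sequence identity, as used above, can be computed from a pairwise alignment. The Python sketch below is a minimal illustration under stated assumptions: the two sequences are already aligned to equal length, gaps are written as "-", and identity is counted over non-gap columns only (other conventions for the denominator exist). The function name and the toy sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of non-gap alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy aligned sequences, not real proteins.
    print(percent_identity("MKVLA-GH", "MRVLASGH"))
```

In practice the alignment itself would come from a dedicated alignment tool; this sketch only covers the counting step.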
Structure of the Globin Proteins
These globular proteins are composed mostly of α-helical secondary structure. The similar folds of the globins can be readily seen by comparing the structures of the β chain of hemoglobin, myoglobin, and leghemoglobin (Fig. 3.13). The closely similar structures of mammalian myoglobin and the hemoglobin β subunit might be expected, but the resemblance of the distantly related plant leghemoglobin is striking. Comparison of the sequences of the members of protein families has brought to light the fact that amino acids within a given class exhibit a large degree of functional redundancy. In this regard, the 3 proteins discussed here exhibit less than 20% identity in their sequences, yet have the same structure. Lastly, in hemoglobin 2 different globin chains have combined to form a multisubunit protein.
Overview of Protein Folding
Many experiments have shown that proteins can spontaneously fold from an unfolded state to their folded native state. This demonstrates that the amino acid sequence contains enough information to specify tertiary structure. Bonds within the peptide backbone sample different possible conformations as the final tertiary structure is achieved (Fig. 3.14). Folding tends to occur via successive conformational changes leading to secondary and then tertiary structure elements (Fig. 3.15). The native conformation of a protein typically is its lowest free energy, and therefore most stable, structure. The unfolded (denatured) conformation of a protein can be generated by heating or treatment with certain organic solvents.
Chaperone-assisted Protein Folding
The folding of many proteins, particularly large ones, is kinetically slow and is assisted in vivo by folding agents known as chaperones. These proteins are found in all organisms and even in different organelles of eukaryotic cells. Chaperones assist in 1) folding of nascent polypeptides made by translation, and 2) refolding of proteins denatured by environmental damage, such as heat shock. Molecular chaperones bind to unfolded nascent polypeptide chains as they emerge from the ribosome, and prevent aggregation, misfolding, and degradation (Fig. 3.16). The hydrolysis of ATP by the chaperone drives conformational changes that prevent aggregation and help drive protein folding. Accessory proteins participate in the process. Eukaryotic molecular chaperones such as Hsp70 (cytosol & mitochondrial matrix) and BiP (ER) are related to the bacterial protein DnaK.
Chaperonins
Eukaryotic chaperonins such as the TRiC complex are large multimeric complexes related to the bacterial GroEL and GroES proteins. These complexes take up unfolded proteins into an internal chamber for folding (Fig. 3.17). ATP hydrolysis drives folding.
Neurodegenerative Diseases
In neurodegenerative diseases such as Alzheimer's disease and transmissible spongiform encephalopathy (e.g., mad cow disease), insoluble misfolded proteins accumulate in the brain in pathological lesions known as plaques, resulting in neurodegeneration (Fig. 3.18). In Alzheimer's disease, the protein known as amyloid precursor protein is cleaved into a peptide product (β-amyloid) that aggregates and precipitates in amyloid filaments. The misfolding of β-amyloid, which involves a transition from an α-helical to a β-sheet conformation, leads to filament formation. In mad cow disease, prion proteins precipitate, causing lesions.
Ligand-binding Proteins
The term ligand refers to any molecule that can be bound by a protein. Ligands may be hormones, metabolites, or even other proteins. Ligand binding requires molecular complementarity. The greater the degree of complementarity, the higher the specificity and affinity of the interaction. Affinity is reflected in the Kd for binding. Protein-ligand binding is illustrated here for antibodies (Fig. 3.19a). The complementarity-determining regions (CDRs) of the antibody make highly specific contacts with epitopes in the antigen (Fig. 3.19b).
Overview of Enzyme Catalysis I
Enzymes are proteins (a few are RNAs called ribozymes) that catalyze chemical reactions within living organisms. Enzyme-catalyzed reactions typically are highly specific, and rate enhancements of 10^6 to 10^12 are common. In an enzyme-catalyzed reaction, the reactant (the substrate) is converted into the product. Like all catalysts, enzymes are not consumed in a reaction. Further, they do not change the ΔG° or Keq for the reaction, only its rate. Rate enhancement is achieved because enzymes are most complementary to the transition state structure formed in the reaction. This results in stabilization of the transition state and lowering of the activation energy barrier (ΔG‡) for the reaction (Fig. 3.20).
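To get a feel for what a 10^6 to 10^12 rate enhancement means in terms of the activation barrier, the Python sketch below converts one into the other. It assumes simple transition state theory, in which the rate constant scales as exp(-ΔG‡/RT), so the barrier reduction is RT ln(enhancement). That relation, the default temperature, and the function name are assumptions added for illustration and are not taken from the slide.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def barrier_reduction_kcal_per_mol(rate_enhancement, temperature_k=298.15):
    """Drop in activation free energy implied by a given rate enhancement.

    Assumes k is proportional to exp(-dG_act / RT), so that
    enhancement = exp(ddG / RT) and ddG = RT * ln(enhancement).
    """
    if rate_enhancement <= 0:
        raise ValueError("rate enhancement must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(rate_enhancement)


if __name__ == "__main__":
    for enhancement in (1e6, 1e12):
        print(enhancement, round(barrier_reduction_kcal_per_mol(enhancement), 1))
```

Under these assumptions the quoted range corresponds to lowering ΔG‡ by roughly 8 to 16 kcal/mol at room temperature, with no change to ΔG° for the overall reaction.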
Overview of Enzyme Catalysis II
The transformation of a substrate to the product occurs in the active site of an enzyme. The active site can be subdivided into a catalytic site, wherein amino acids that catalyze the reaction reside, and a binding pocket that recognizes a specific feature of the substrate, conferring specificity to the enzyme-substrate interaction. A schematic model for an enzyme-catalyzed reaction is shown in Fig. 3.23. The kinetic equation describing the reaction is E + S ⇌ ES → E + P. A reaction coordinate diagram showing the binding and catalytic steps of an enzyme-catalyzed reaction is shown in Fig. 3.24.
Enzyme Kinetics: Enzyme Concentration
The velocity of an enzyme-catalyzed reaction approaches a maximal rate (Vmax) at high concentrations of substrate (Fig. 3.22a). Vmax is achieved when all enzyme molecules have bound the substrate and are engaged in catalysis (saturation). The biochemists Leonor Michaelis and Maud Menten developed a kinetic equation to explain the behavior of most enzymes. They showed that the maximal rate of an enzyme-catalyzed reaction (Vmax) depends on the concentration of enzyme (Fig. 3.22a) and the rate constant for the rate-limiting step of the reaction.
MM equation: $V_0 = \frac{V_{max}[S]}{[S] + K_M}$
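The dependence of rate on substrate and enzyme concentration can be sketched in Python. This is a minimal illustration, not a fitting routine: it writes Vmax as kcat × [E]total, where kcat stands in for the rate constant of the rate-limiting step mentioned above. The name kcat, the function name, and the example numbers are assumptions added here.

```python
def michaelis_menten_rate(substrate, km, kcat, enzyme_total):
    """Initial velocity V0 = Vmax * [S] / ([S] + KM), with Vmax = kcat * [E]total.

    All concentrations must share the same units; kcat is per unit time.
    """
    if substrate < 0 or km <= 0 or kcat < 0 or enzyme_total < 0:
        raise ValueError("concentrations and constants must be non-negative, KM positive")
    vmax = kcat * enzyme_total
    return vmax * substrate / (substrate + km)


if __name__ == "__main__":
    # Hypothetical enzyme: KM = 50 (uM), kcat = 100 (per s).
    for enzyme in (1.0, 2.0):
        rates = [michaelis_menten_rate(s, 50.0, 100.0, enzyme) for s in (5, 50, 500, 5000)]
        print(enzyme, [round(v, 1) for v in rates])
```

Doubling the enzyme concentration doubles the rate at every substrate concentration, which is the behavior the slide attributes to Vmax.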
Enzyme Kinetics: Substrate Affinity
Michaelis and Menten also derived a kinetic constant, the Michaelis constant (KM), that is indicative of the affinity of most enzymes for their substrates. The lower the KM, the higher the affinity of the enzyme for the substrate (Fig. 3.22b). The KM is the concentration of substrate at which the reaction rate is half-maximal (1/2 Vmax). The concentrations of cellular metabolites usually are set near the KMs of the enzymes that carry out their metabolism. This allows cells to respond to changes in substrate concentration.
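The half-maximal property of KM follows directly from the Michaelis-Menten equation. The short worked derivation below uses the usual symbols V_0, V_max, K_M, and [S]. The two limiting cases are included as a sketch of why operating near KM keeps the rate responsive to substrate.

```latex
\begin{align*}
V_0 &= \frac{V_{max}[S]}{[S] + K_M} \\
[S] = K_M:\quad V_0 &= \frac{V_{max}K_M}{K_M + K_M} = \frac{V_{max}}{2} \\
[S] \ll K_M:\quad V_0 &\approx \frac{V_{max}}{K_M}[S] \quad \text{(rate proportional to } [S]\text{)} \\
[S] \gg K_M:\quad V_0 &\approx V_{max} \quad \text{(rate insensitive to } [S]\text{)}
\end{align*}
```

Near KM the enzyme is neither saturated nor idle, so a rise or fall in substrate concentration still produces a noticeable change in rate.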
Mechanism of Serine Proteases I
Proteases are enzymes that cleave peptide bonds in other proteins. The serine proteases, which are important for digestion and blood coagulation, contain reactive serine residues in their catalytic sites. Also present are aspartate and histidine residues that together with serine make up what is called the catalytic triad. The active sites of serine proteases also contain binding pockets that confer specificity by positioning the peptide bond that is to be cleaved next to the reactive serine (Fig. 3.25a, trypsin). The digestive proteases trypsin, chymotrypsin, and elastase select cleavage sites based on the features of their binding pockets (Fig. 3.25b).
Specificity:
• Trypsin: basic aas
• Chymotrypsin: aromatic aas
• Elastase: small side-chain aas
Mechanism of Serine Proteases II
In the serine protease reaction mechanism, an acyl enzyme intermediate is formed transiently after peptide bond cleavage by serine (Fig. 3.26). The acyl group is hydrolyzed off the serine later in the reaction. Both acid-base catalysis (Steps a, c, d, and f) and transition state stabilization (Steps b and e) occur during the reaction. The reaction mechanism is inhibited at low pH due to protonation of His-57 (inset). The pH optimum of serine protease reactions therefore occurs at or slightly above neutrality.
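The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation, which gives the fraction of a titratable group that is deprotonated at a given pH. The Python sketch below is illustrative only: the slide does not give a pKa for His-57, so the value used here (7.0) is a placeholder assumption, and the function name is hypothetical.

```python
ASSUMED_HIS57_PKA = 7.0  # placeholder assumption, not a value from the slide


def fraction_deprotonated(ph, pka=ASSUMED_HIS57_PKA):
    """Fraction of a titratable group in its deprotonated (basic) form.

    Henderson-Hasselbalch: [base]/[acid] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    for ph in (5.0, 6.0, 7.0, 8.0, 9.0):
        print(ph, round(fraction_deprotonated(ph), 3))
```

With this assumed pKa, only about 1% of the histidine is deprotonated (and able to act as a base) at pH 5, which is consistent with inhibition of the mechanism at low pH.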
Multifunctional Enzymes
Most metabolic pathways occur via multiple enzyme-catalyzed steps. As illustrated in Fig. 3.28, the rates of pathway reactions can be increased if the substrates and products of each step are channeled to the next enzyme in the pathway. Channeling is enhanced in multisubunit enzyme complexes and by attachment of enzymes to scaffolds (Fig. 3.28b), or even by fusion of encoded enzymes into a single polypeptide chain (Fig. 3.28c).
Regulating Protein Function by Degradation
The proteolytic degradation (turnover) of proteins is important for regulatory processes, cell renewal, and disposal of denatured and damaged proteins. Lysosomes carry out degradation of endocytosed proteins and retired organelles. Cytoplasmic protein degradation is performed largely by the molecular machine called the proteasome. Proteasomes recognize and degrade ubiquitinated proteins (Fig. 3.29). Ubiquitin is a 76-amino-acid protein that, after conjugation to a target protein, directs it to the proteasome. In ATP-dependent steps, the C-terminus of ubiquitin is covalently attached to a lysine residue in the protein. Polyubiquitination then takes place. The proteasome degrades the protein to peptides, and released ubiquitin molecules are recycled.
Regulating Function by Ligand Binding
The binding of a ligand to a protein typically triggers an allosteric ("other shape") conformational change resulting in the modification of its activity. An overview of regulation via allosteric transitions is presented here in the context of the tetrameric O2-binding protein, hemoglobin (Hb). As shown in Fig. 3.30, the O2 binding curve for Hb does not show the simple hyperbolic shape exhibited by proteins that bind a ligand with the same affinity regardless of ligand concentration. Instead, the Hb O2-binding curve is sigmoidal, which indicates that the affinity for O2 molecules increases after the first 1 or 2 have bound. In this case, binding displays positive cooperativity. Negative cooperativity is observed with other protein-ligand systems. The reduced O2 binding affinity of Hb at low O2 tensions favors release of O2 to peripheral tissues.
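The contrast between hyperbolic and sigmoidal binding curves can be sketched with the Hill equation, a common empirical description of cooperative binding. The slide does not use the Hill equation or give parameters, so everything numeric here is an assumption chosen for illustration: a half-saturation pressure (p50) of 26 and a Hill coefficient of 2.8 are typical textbook-style values for Hb, and a coefficient of 1 reproduces the simple hyperbolic case.

```python
def fractional_saturation(p_o2, p50, hill_n=1.0):
    """Fraction of binding sites occupied, from the Hill equation.

    Y = p^n / (p50^n + p^n). hill_n = 1 gives a hyperbolic curve;
    hill_n > 1 gives a sigmoidal curve (positive cooperativity).
    """
    if p_o2 < 0 or p50 <= 0:
        raise ValueError("p_o2 must be non-negative and p50 positive")
    ratio = (p_o2 / p50) ** hill_n
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    # Illustrative parameters only (same p50 for both curves).
    for p in (10, 26, 40, 100):
        hyperbolic = fractional_saturation(p, 26.0, 1.0)
        sigmoidal = fractional_saturation(p, 26.0, 2.8)
        print(p, round(hyperbolic, 2), round(sigmoidal, 2))
```

With the same p50, the cooperative curve lies below the hyperbolic one at low O2 tension and above it at high O2 tension, which is the behavior that favors loading in the lungs and release in peripheral tissues.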
Calmodulin-mediated Switching
Many proteins play switching functions in cell signaling. Calcium ion (Ca2+) is a very important messenger in cell signaling. Cells maintain cytoplasmic calcium concentration at about 10^-7 M. When calcium concentration rises above this level due to hormone-receptor signaling processes, etc., it binds to a protein known as calmodulin (Kd = 10^-6 M), triggering conformational changes that result in its activation. Calmodulin contains 4 helix-loop-helix motifs (EF hands), each of which can bind calcium (Fig. 3.31). Calcium binding causes a major allosteric transition in calmodulin. In its alternate conformation, calmodulin binds to target proteins, changing their activity.
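The two concentrations on this slide (resting Ca2+ near 10^-7 M and Kd = 10^-6 M) explain why calmodulin behaves as a switch. The Python sketch below estimates site occupancy with the simple one-site binding relation, fraction bound = [L] / ([L] + Kd). It is a deliberate simplification: it treats each site as independent with a single Kd and ignores any cooperativity among the four EF hands. The function name is hypothetical.

```python
CALMODULIN_KD_M = 1e-6  # Kd quoted on the slide


def fraction_bound(ligand_conc_m, kd_m=CALMODULIN_KD_M):
    """Fraction of sites occupied for simple one-site binding: [L] / ([L] + Kd)."""
    if ligand_conc_m < 0 or kd_m <= 0:
        raise ValueError("concentration must be non-negative and Kd positive")
    return ligand_conc_m / (ligand_conc_m + kd_m)


if __name__ == "__main__":
    for calcium_m in (1e-7, 1e-6, 1e-5):
        print(calcium_m, round(fraction_bound(calcium_m), 2))
```

Under this simplification, occupancy is about 9% at the resting concentration, 50% at 10^-6 M, and about 91% at 10^-5 M, so a modest rise in Ca2+ moves calmodulin from mostly empty to mostly loaded.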
GTPase-mediated Switching
Proteins belonging to the GTPase superfamily, such as Ras and G proteins, serve as guanine nucleotide-dependent regulatory switches that control the activity of specific target proteins (Fig. 3.32). When bound to GTP, these proteins adopt an active conformation that modulates target protein function. When bound to GDP, their activity is turned off. The time-frame of activation depends on the intrinsic GTPase activity (the timer function) of these proteins. In addition, GTP and GDP binding (and thus activity) may be regulated by other factors. Examples of such regulation will be covered later.
Regulation by Kinase/Phosphatase Switching
Protein function also can be regulated by allosteric transitions caused by covalent modification via phosphorylation (Fig. 3.33). Phosphorylation typically occurs on serine, threonine, and tyrosine residues. Enzymes known as kinases carry out phosphorylation. Their activity is opposed by phosphatases, which hydrolyze phosphates off the modified amino acid. Some proteins are turned on by phosphorylation; others are turned off.