2. Proteins, structures and how to get them.
A very brief introduction to:
• Protein production
• Protein structure determination by X-ray crystallography
3. Proteins as drug targets
• Proteins are the targets of the vast majority of
marketed drugs.
• Target-based drug development therefore requires
a good supply of high-quality, purified proteins for:
Assay development
High throughput screening
Biophysical analysis, e.g. SPR, ITC
Structure determination, e.g. X-ray, NMR, Cryo-EM
• Proteins themselves are also used as important
biological therapeutics, e.g. mAbs, vaccines, TNF
4. Protein production: Expression systems
• Historically, proteins were obtained from natural
sources, e.g. animal organs, blood, plants.
• Today, proteins are generated recombinantly.
• Synthetic, codon-optimised genes for the target
protein are cloned into a vector or plasmid and
introduced into cultured cells for overexpression.
• Common expression cell systems:
E. coli, e.g. BL21(DE3)
Baculovirus/insect cells, e.g. Sf9 or Sf21
Mammalian cells, e.g. HEK 293
• Depending on need and yield, culture volumes can
range from tens of millilitres to thousands of litres.
5. Protein production: Construct design
• Protein construct design is often the critical step in generating the protein required.
• Optimising a protein construct involves careful consideration of:
What form of the protein is required? e.g. full-length, catalytic/functional domain,
protein in complex; are PTMs (phosphorylation, glycosylation) required for function?
Which expression system and vectors to use? e.g. E. coli, insect, mammalian, yeast.
Affinity tags/fusion proteins, with or without protease cleavage sites, to:
i) ease purification, e.g. 6His, GST, streptavidin.
ii) improve solubility, e.g. MBP, SUMO.
iii) enable assays, e.g. Avi-tag + biotinylation.
Reduction of structural heterogeneity & disorder – important for X-ray crystallography,
e.g. N- & C-terminal truncations, deletion of mobile domains or loops.
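As an illustration of N- and C-terminal truncation design, the sketch below lists every combination of candidate start and end residues for a sequence. The function name, the toy sequence and the boundary positions are all hypothetical; in practice the boundaries would come from domain annotation or disorder prediction.

```python
from itertools import product

def truncation_constructs(sequence, n_starts, c_ends):
    """Return {name: subsequence} for every N-/C-terminal boundary combination.

    Residue numbers are 1-based and inclusive, as in most sequence annotations.
    """
    constructs = {}
    for start, end in product(n_starts, c_ends):
        if not 1 <= start <= end <= len(sequence):
            raise ValueError(f"invalid boundaries {start}-{end}")
        constructs[f"{start}-{end}"] = sequence[start - 1:end]
    return constructs

# Hypothetical 30-residue sequence and boundaries, for illustration only.
toy_sequence = "MGSSHHHHHHSSGLVPRGSHMKTAYIAKQR"
constructs = truncation_constructs(toy_sequence, n_starts=[1, 11, 21], c_ends=[25, 30])
for name, seq in constructs.items():
    print(name, len(seq), seq)
```

Three start points and two end points give six constructs, which shows how quickly a parallel construct panel grows.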
6. Protein production: Purification
• Purification is carried out mostly by liquid chromatography
methods based on the physical properties of the proteins.
• Affinity chromatography – separation based on the binding affinity
of the protein itself or of a fused tag, e.g. His6, GST, MBP, FLAG
(tags are often removed subsequently by protease treatment).
• Ion-exchange chromatography – separation based on
protein charge.
• Size-exclusion chromatography – separation based on
molecular size.
• Purification of membrane proteins (e.g. GPCRs) can be very
difficult due to their hydrophobicity and their instability outside
the membrane environment. It requires careful use of lipids &
detergents, and often stabilising mutations.
7. Protein production: Quality Control
• SDS gels – give an indication of purity & approx. molecular size of
denatured protein.
• A280 – gives a good estimate of protein concentration.
• Analytical size-exclusion chromatography – molecular size of
protein or complex in solution.
• Mass Spectrometry (MS):
i) Intact MS – accurate molecular mass + PTMs
ii) Peptide-mass fingerprinting – confirmation of protein ID.
• Functional assay – Is the protein catalytically active?
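The A280 estimate rests on the Beer-Lambert law, A = ε·c·l. The sketch below assumes the widely used approximation ε280 ≈ 5500 per Trp + 1490 per Tyr + 125 per disulfide (M⁻¹ cm⁻¹), which is not given in the slides. The function names and the toy peptide are hypothetical.

```python
def extinction_coefficient_280(sequence, n_disulfides=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumes the common approximation: 5500 per Trp, 1490 per Tyr,
    125 per disulfide bond (cystine).
    """
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * n_disulfides

def concentration_molar(a280, epsilon, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l)."""
    if epsilon <= 0:
        raise ValueError("no Trp/Tyr/cystine: A280 is not usable for this protein")
    return a280 / (epsilon * path_cm)

# Hypothetical peptide with 2 Trp and 2 Tyr, measured in a 1 cm cuvette.
epsilon = extinction_coefficient_280("MKWVTFYSLLWAY")
conc = concentration_molar(a280=0.699, epsilon=epsilon)
print(f"epsilon = {epsilon} M^-1 cm^-1, concentration = {conc * 1e6:.1f} uM")
```

Multiplying the molar concentration by the molecular weight gives mg/ml. Proteins with no Trp or Tyr need another method.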
8. Protein structure: Reasons for protein structure
• Once a purified protein is available we may
want to determine its structure by X-ray
crystallography, Cryo-EM or NMR.
• Protein structures themselves can inform on
function.
• High-resolution structures (≤ 2.5 Å) of protein-ligand
complexes enable structure-based ligand design:
they explain SAR and aid chemistry design.
• Protein structures are essential for successfully
progressing fragment-based ligand design
(FBLD) projects.
9. Protein crystallography: X-rays
[Figure: length-scale bar (1 nm, 10 nm, 100 nm, 1 µm, 10 µm, 0.1 mm) locating atomic bonds, proteins, cells and protein crystals; marked are the limit of the light microscope (0.4 µm) and the wavelength of X-rays (0.15 nm = 1.5 Å).]
• X-ray crystallography is an experimental method that exploits the fact that X-rays can be
diffracted by the periodic assembly of atoms or molecules in a crystal.
• X-rays have the appropriate wavelength (1-2 Å) to be scattered by the electron clouds
of atoms of comparable size.
• By recording the X-ray diffraction patterns the electron density within the crystal can be
reconstructed mathematically.
11. Protein crystallography: Crystallisation
• There is no way to predict the conditions that will
crystallise a protein from its amino acid sequence.
• Crystallisation screening: an empirical process of
incubating large numbers of cocktails of precipitants,
buffers and salts with the protein of interest to try to
find conditions that produce crystals that diffract well.
• Usually carried out robotically in 96-well plates using
volumes of a few hundred nanolitres per well.
• Common methods: vapour diffusion, microbatch and
microdialysis.
“Think of it as finding a recipe where we have no idea what ingredients we need”
13. Protein crystallography: Diffraction experiment
• Generally the only information obtained from a diffraction image is
i) the position & ii) the intensity of the diffraction maxima ie. spots or reflections.
• This is not sufficient information to reconstruct the electron density within the crystal.
• Reconstruction also requires the phase angle of each reflection – but this is not directly observed in diffraction images.
• This is the so-called “phase problem”.
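A toy one-dimensional example can make the phase problem concrete. The sketch below builds a hypothetical "electron density" of two Gaussian peaks, computes its Fourier coefficients as stand-ins for structure factors, and reconstructs the density once with the true phases and once with all phases set to zero. It is a simplified illustration and not how crystallographic software is implemented.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform (toy stand-in for structure factors)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(values))
            for h in range(n)]

def inverse_dft(coeffs):
    """Inverse transform, returning the real part (the 'density')."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n) for h, c in enumerate(coeffs)).real / n
            for x in range(n)]

def max_error(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

# Hypothetical 1D density: two Gaussian "atoms" on a 64-point grid.
n_grid = 64
density = [math.exp(-0.5 * ((x - 20) / 2.0) ** 2) + 0.6 * math.exp(-0.5 * ((x - 45) / 2.0) ** 2)
           for x in range(n_grid)]

structure_factors = dft(density)
amplitudes = [abs(f) for f in structure_factors]       # measurable: sqrt(intensity)
phases = [cmath.phase(f) for f in structure_factors]   # not recorded in the experiment

with_phases = inverse_dft([a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)])
without_phases = inverse_dft(amplitudes)                # all phases set to zero

print("max error with true phases:", max_error(with_phases, density))
print("max error with zero phases:", max_error(without_phases, density))
```

With the true phases the density is recovered to numerical precision. With amplitudes alone the map collapses to a peak at the origin and the atom positions are lost.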
14. Protein crystallography: Phases
The necessary phase information can be obtained in a number of ways:
• Molecular Replacement (MR): use the structure of a
similar/homologous protein (>30% sequence identity).
• Multiple Isomorphous Replacement (MIR): soak
heavy-metal atoms into crystals, e.g. U, Hg, Au salts.
• Multi-wavelength Anomalous Dispersion (MAD): replace Met
residues with selenoMet when expressing the protein & tune the
X-ray wavelength to the selenium absorption edge.
• Sulphur SAD: rely on the anomalous signal of native sulphur atoms.
15. Protein crystallography: Electron density to model
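Calculate 3D electron density map
Build atomic model into electron density
Refine model against experimental data
R-factor: $R = \frac{\sum \left|\, |F_o| - k\,|F_c| \,\right|}{\sum |F_o|}$, where $|F_o|$ and $|F_c|$ are the observed and calculated structure-factor amplitudes and $k$ is a scale factor
R-free: cross-validation value (the same R-factor calculated on a small set of reflections excluded from refinement)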
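The R-factor is the summed absolute difference between observed and scaled calculated structure-factor amplitudes, divided by the sum of the observed amplitudes. R-free is the same quantity over reflections held out of refinement. The sketch below computes both from plain lists. The function names, amplitudes and free-set flags are hypothetical, and real programs also determine the scale factor k rather than taking it as given.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """R = sum(| |Fo| - k|Fc| |) / sum(|Fo|) over the reflections supplied."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)

def r_work_and_r_free(f_obs, f_calc, free_flags, scale=1.0):
    """Split reflections by free-set flag and return (R-work, R-free)."""
    rows = list(zip(f_obs, f_calc, free_flags))
    work = [(fo, fc) for fo, fc, is_free in rows if not is_free]
    test = [(fo, fc) for fo, fc, is_free in rows if is_free]
    r_work = r_factor([fo for fo, _ in work], [fc for _, fc in work], scale)
    r_free = r_factor([fo for fo, _ in test], [fc for _, fc in test], scale)
    return r_work, r_free

# Hypothetical amplitudes; the last two reflections form the free set.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 50.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0, 40.0]
free_flags = [False, False, False, False, True, True]

r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
print(f"R-work = {r_work:.3f}, R-free = {r_free:.3f}")
```

The free-set reflections never influence refinement, so a large gap between R-work and R-free is a warning sign of over-fitting.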
16. Protein crystallography: Validation/QC
Quality of model:
• Stereochemistry, i.e. bond lengths and angles – rms deviations
• Main-chain dihedral angles φ & ψ: Ramachandran plots –
number of violations
• Side-chain rotamers – number of deviations from preferred
conformations.
• Planarity of aromatic rings and peptide groups – rms deviations
• MolProbity clashscore – number of close non-bonded
contacts per 1000 atoms
Quality of Experimental X-ray Data:
• Resolution (<3.0 Å)
• Completeness (>90%)
• Redundancy (>5x)
• Signal-to-noise (I/σ(I) > 2)
• Merging statistics, e.g. Rmeas, Rpim (≤10%), CC1/2
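The rule-of-thumb limits above can be applied as a simple automated check on a data set's summary statistics. The sketch below covers the four single-valued limits (resolution, completeness, redundancy, signal-to-noise). The dictionary keys and example values are hypothetical, and the thresholds are guides, not hard cut-offs.

```python
# Thresholds taken from the list above; key names are illustrative only.
THRESHOLDS = {
    "resolution_A": ("<", 3.0),
    "completeness_pct": (">", 90.0),
    "redundancy": (">", 5.0),
    "i_over_sigma": (">", 2.0),
}

def failed_checks(stats):
    """Return the names of statistics that miss their rule-of-thumb threshold."""
    failures = []
    for key, (op, limit) in THRESHOLDS.items():
        value = stats[key]
        ok = value < limit if op == "<" else value > limit
        if not ok:
            failures.append(key)
    return failures

# Hypothetical data sets, for illustration only.
good = {"resolution_A": 2.1, "completeness_pct": 98.5, "redundancy": 6.8, "i_over_sigma": 2.4}
poor = {"resolution_A": 3.4, "completeness_pct": 85.0, "redundancy": 6.0, "i_over_sigma": 2.5}

print("good data set fails:", failed_checks(good))
print("poor data set fails:", failed_checks(poor))
```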
18. Protein production: Construct design strategy
• Often the best strategy is to work in parallel with as many
constructs as is practicable or affordable.
• Constructs can initially be assessed quickly at small
scale (~150 ml) by expression level – a good surrogate for
protein behaviour.
• Once purified, constructs can be scored by catalytic/functional
activity or by protein stability (thermal shift assay).
• If required, the results can be used to refine the constructs in
subsequent rounds of design.
• Repeat until successful!