Molecular Structures 2009


Published in: Technology, Education

  1. 1. Jennifer Lyon, MS, MLIS Eskind Biomedical Library Vanderbilt University Medical Center [email_address]
  2. 2. Introduction <ul><li>The focus of this module will be on the molecular structures of proteins, though the structures may also include DNA/RNA, small molecules (e.g., drugs), and ions. </li></ul><ul><li>Why look at protein structures? </li></ul><ul><li>Proteins fulfill a wide range of biological functions that depend upon their three-dimensional structures. </li></ul><ul><li>Understanding protein structures tells us about the relationship between protein structure and function, the relationships between different proteins, and the effect of alterations of protein structure on function. </li></ul><ul><li>Uses of this information include designing novel drugs that affect protein function in order to prevent or cure disease. </li></ul>
  3. 3.   Image from the National Human Genome Research Institute (NHGRI) Talking Glossary of Genetic Terms
  4. 4. Methods <ul><li>The two most widely used experimental methods for determining protein structures are: </li></ul><ul><li>X-Ray Crystallography </li></ul><ul><li>NMR (Nuclear Magnetic Resonance) Spectroscopy </li></ul><ul><li>Scientists are working hard to develop computational methods of predicting structures from protein sequences, but this remains a difficult challenge. The most effective form of computational analysis models a protein structure based on its similarity to another protein whose structure has been determined experimentally. Conserved (functional) domains are particularly useful in this type of modeling. </li></ul>
  5. 5. Understanding X-Ray Crystallography <ul><li>For X-Ray Crystallography, a protein must be purified and crystallized into a static form. </li></ul><ul><li>X-Rays are beamed through the crystal, which causes them to be deflected and scattered by the atoms within the structure. </li></ul><ul><li>The resulting diffraction pattern is unique for a given molecule and computer analysis allows identification of each atom and its position. </li></ul><ul><li>Results are dependent upon how well-purified the protein is and how well it crystallizes. Some proteins require artificial conditions to be crystallized. </li></ul>
  6. 6. Validity – X-Ray Crystallography <ul><li>The validity of an X-Ray structure is numerically represented by its resolution (given in angstroms). The lower the resolution value (that is, the smaller the number of angstroms), the more complete and exact the structure. In general, a reliable structure will have a resolution of 2 angstroms or less. </li></ul>
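The 2-angstrom rule of thumb above can be expressed as a simple filter. The following is a minimal sketch: the function name, the constant, and the structure names with their resolution values are all made up for illustration and do not come from any database.

```python
# Rule-of-thumb cutoff taken from the slide: 2 angstroms or less.
RELIABLE_RESOLUTION_ANGSTROMS = 2.0


def is_reliable_resolution(resolution_angstroms, cutoff=RELIABLE_RESOLUTION_ANGSTROMS):
    """Return True when the resolution value is at or below the cutoff.

    A smaller number of angstroms means finer detail, so "better"
    resolution corresponds to a lower value.
    """
    if resolution_angstroms <= 0:
        raise ValueError("resolution must be a positive number of angstroms")
    return resolution_angstroms <= cutoff


# Hypothetical records: name -> resolution in angstroms.
structures = {"structure_a": 1.7, "structure_b": 2.0, "structure_c": 3.5}

reliable = sorted(
    name for name, resolution in structures.items()
    if is_reliable_resolution(resolution)
)
print(reliable)
```

The cutoff is a parameter rather than a hard-coded value because "2 angstroms or less" is a general guideline, not a strict rule.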
  7. 7. Resolution - Examples <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for these examples. </li></ul>
  8. 8. Understanding NMR Spectroscopy <ul><li>When a molecule is placed in a magnetic field and exposed to radio waves, the nucleus of each atom will resonate (or spin). The resonance of each atom creates a unique signal that can be read by a radio wave recorder. </li></ul><ul><li>The advantage of NMR is that it can be done while the protein is in solution, a more natural environment than a crystal. </li></ul><ul><li>Proteins are naturally flexible and tend to “wriggle.” </li></ul><ul><li>NMR structure records often consist of multiple structure models for the same molecule, as it naturally moves in space. </li></ul><ul><li>The primary disadvantage of NMR is the extensive time it takes; it also doesn’t work well for larger molecules. </li></ul>
  9. 9. NMR/X-Ray Structures
  10. 10. Summary of Methods X-Ray: Provides the highest resolution. Requires crystallization of the protein and usually gives only one model of the structure. May be automated in the future. Reliable resolutions for structures are around 2 angstroms or less. NMR: Allows structure determination of a protein in solution. Variability of solution conditions is possible. Provides characterization of intrinsic protein motion in solution. Computation: Simulates the action of the forces that act on each atom in a molecule of known composition and approximates the structure. Produces theoretical models. Fast, but presently the least reliable.
  11. 11. Remember… <ul><li>Regardless of the method used, no protein structure is properly represented by a single conformation. Every protein molecule has intrinsic motion and will shift and change based on molecular properties and environmental conditions. Also, the data used to generate the structure is never perfect. Therefore, a true protein structure is actually a set of models. When you view a single structure, you are actually viewing a snapshot of the protein in time. </li></ul>
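One way to picture "a set of models" is to measure how far a single atom wanders across the models of one record. The sketch below is illustrative only: the coordinates are invented, the function names are hypothetical, and it assumes the models have already been superimposed so that positions are directly comparable.

```python
import math


def mean_position(positions):
    """Average (x, y, z) of one atom across several models."""
    count = len(positions)
    return tuple(sum(p[axis] for p in positions) / count for axis in range(3))


def rms_fluctuation(positions):
    """Root-mean-square distance of the atom's positions from their mean."""
    center = mean_position(positions)
    squared_distances = [
        sum((p[axis] - center[axis]) ** 2 for axis in range(3))
        for p in positions
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Invented coordinates (in angstroms) of the same atom in three models.
atom_in_models = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]

print(mean_position(atom_in_models))
print(round(rms_fluctuation(atom_in_models), 3))
```

A large fluctuation for an atom suggests a flexible region; a single snapshot would hide that information.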
  12. 12. Databases and Searching <ul><li>PDB (Protein DataBank) </li></ul><ul><li>MMDB (Molecular Modeling DataBase) </li></ul><ul><ul><li>AKA Entrez Structure </li></ul></ul>
  13. 13. The Protein Data Bank <ul><li>PDB (Protein Data Bank) is the international database of 3-D biological macromolecular structures. </li></ul><ul><li>It is maintained by a nonprofit organization, the Research Collaboratory for Structural Bioinformatics (RCSB), associated with Rutgers University, San Diego Supercomputer Center, and the Biotechnology Division of the National Institute of Standards and Technology. </li></ul><ul><li>It is a public, free-access database that contains molecular structures of proteins and nucleic acids, primarily structures derived experimentally by X-Ray crystallography and NMR. Data submitted to PDB is validated before an entry is completed. </li></ul><ul><li>Educational Resources are available on the PDB website. </li></ul>
  14. 14. The PDB Advanced Search <ul><li>The PDB database offers a Java-based advanced search facility. The browser must be Java-enabled for it to function. </li></ul><ul><li>Let’s go live: </li></ul><ul><li>Search for a human topoisomerase structure that was determined by X-Ray Crystallography </li></ul><ul><ul><li>How about finding one bound to DNA? </li></ul></ul>
  15. 15. Viewing Structures From PDB <ul><li>The PDB does not distribute software for molecular structure viewing; however, it does provide a list of commonly used molecular graphics programs. </li></ul><ul><li>Several interactive viewers that do not require additional software to be downloaded and installed can be accessed easily from the Structure Summary page. These include: </li></ul><ul><li>KiNG </li></ul><ul><li>Jmol </li></ul><ul><li>WebMol </li></ul><ul><li>QuickPDB </li></ul><ul><li>Protein Workshop </li></ul><ul><li>Other programs that require installing additional software or special configurations of your web browser may also be used, usually by downloading a PDB coordinate file and then opening it in the software. </li></ul>
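For readers curious about what a downloaded PDB coordinate file contains, the sketch below reads one atom line. It assumes the classic fixed-column PDB text format, in which each ATOM or HETATM line stores its fields at set column positions. The function name is hypothetical and the sample line is invented for illustration, not taken from a real entry.

```python
def parse_atom_record(line):
    """Parse one ATOM/HETATM line using fixed column positions."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM or HETATM record")
    return {
        "serial": int(line[6:11]),            # columns 7-11
        "atom_name": line[12:16].strip(),     # columns 13-16
        "residue_name": line[17:20].strip(),  # columns 18-20
        "chain_id": line[21],                 # column 22
        "residue_number": int(line[22:26]),   # columns 23-26
        "x": float(line[30:38]),              # columns 31-38
        "y": float(line[38:46]),              # columns 39-46
        "z": float(line[46:54]),              # columns 47-54
    }


# Invented sample line, assembled field by field so the columns are visible.
SAMPLE_LINE = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # columns 7-11: atom serial number
    " "           # column 12: unused
    " N  "        # columns 13-16: atom name
    " "           # column 17: alternate location indicator
    "MET"         # columns 18-20: residue name
    " "           # column 21: unused
    "A"           # column 22: chain identifier
    "   1"        # columns 23-26: residue sequence number
    "    "        # columns 27-30: insertion code and padding
    "  27.340"    # columns 31-38: x coordinate
    "  24.430"    # columns 39-46: y coordinate
    "   2.614"    # columns 47-54: z coordinate
)

atom = parse_atom_record(SAMPLE_LINE)
print(atom)
```

Viewer programs do this kind of parsing for every atom line before drawing the structure; real files also contain header, connectivity, and (for NMR entries) per-model records that this sketch ignores.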
  16. 16. Introducing Entrez Structure <ul><li>Entrez Structure is the section of the NCBI’s Entrez system that allows the user to search specifically for molecular structure records in the Molecular Modeling DataBase (MMDB). </li></ul><ul><li>Access: </li></ul><ul><ul><li>From NCBI home page - click on 'structures' in top bar. </li></ul></ul><ul><ul><li>From alphabetical site map - click on structures </li></ul></ul><ul><ul><li>Links from other database records: structures link option; drop-down menu structures link </li></ul></ul><ul><li>Direct link: </li></ul>
  17. 17. MMDB (Molecular Modeling DataBase) <ul><li>When you search Entrez Structure, you are searching the MMDB: </li></ul><ul><li>The MMDB contains 3-dimensional molecular structures, primarily proteins and some nucleic acids. </li></ul><ul><li>It is a curated subset of a database called PDB (Protein DataBank). </li></ul><ul><li>All MMDB records have been checked, annotated, and stored in an ASN.1 format by the NCBI. </li></ul><ul><li>The structure coordinates in PDB/MMDB records have been obtained experimentally by scientists using X-Ray Crystallography and NMR. </li></ul><ul><li>The structural data in MMDB has been cross-linked with bibliographic information, the sequence databases, and the NCBI taxonomy. </li></ul>
  18. 18. Searching Entrez Structure <ul><li>Searching: </li></ul><ul><li>Use the search textbox at the top of the Entrez Structure webpage. </li></ul><ul><li>Entering a text query without a field identifier will automatically search ALL Fields. </li></ul><ul><li>Specialized search fields for Structure include: </li></ul><ul><ul><li>Resolution (in angstroms) - [RESO] </li></ul></ul><ul><ul><li>The experimental method used (X-Ray, NMR, etc.) - [EXPM] </li></ul></ul><ul><ul><li>The PDB definition of a ligand in the PDB structure - [LNAM] </li></ul></ul><ul><ul><li>The number of protein chains in the PDB structure - [PCCNT] </li></ul></ul><ul><li>Note that the search results show a 4-character identifier code for each record. This is actually the PDB ID code for the structure. </li></ul><ul><li>See the Structure Help Page for a complete list of fields available. </li></ul><ul><li>Unfortunately, the Structure Limits screen doesn’t provide much help. </li></ul>
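The field tags listed above can be combined into a query string by hand or with a small helper. This is a minimal sketch: the helper name and the keyword names are hypothetical, only the four tags come from the slide, and the accepted values for each field should be checked against the Structure Help Page.

```python
# Field tags from the slide, keyed by hypothetical, readable names.
FIELD_TAGS = {
    "resolution": "RESO",
    "method": "EXPM",
    "ligand": "LNAM",
    "protein_chain_count": "PCCNT",
}


def build_structure_query(free_text="", **fields):
    """Join fielded terms and optional free text with AND.

    Free text without a tag is searched in all fields.
    An unknown field name raises KeyError.
    """
    terms = [f"{value}[{FIELD_TAGS[name]}]" for name, value in fields.items()]
    if free_text:
        terms.append(free_text)
    return " AND ".join(terms)


print(build_structure_query("lysozyme", method="NMR"))
print(build_structure_query("hemoglobin", protein_chain_count=3))
```

The resulting strings can be pasted into the Entrez Structure search box.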
  19. 19. Example – Searching <ul><li>Consider the following search: </li></ul><ul><li>Find the NMR structure for Mouse Lysozyme M. </li></ul><ul><li>How would you do this search? Try it. Here's an example of the type of search you might do: </li></ul><ul><li>mouse[orgn] AND NMR[EXPM] AND lysozyme M </li></ul><ul><li>Always familiarize yourself with the fields available for each Entrez database as the appropriate use of the field codes in your search will provide much more specific searching capability. </li></ul>
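The same query can also be sent programmatically. The sketch below only builds the request address and does not contact any server. It assumes that the Structure database is reachable through NCBI's E-utilities ESearch service under the database name `structure`; the slides themselves only describe the web search box, so treat the endpoint, parameters, and function name as assumptions to verify against NCBI's documentation.

```python
from urllib.parse import urlencode

# Assumed E-utilities ESearch endpoint (not mentioned in the slides).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def structure_search_url(query, max_results=20):
    """Build an ESearch URL for the Structure database without sending it."""
    params = {"db": "structure", "term": query, "retmax": max_results}
    # urlencode escapes spaces and square brackets so the query is URL-safe.
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = structure_search_url("mouse[orgn] AND NMR[EXPM] AND lysozyme M")
print(url)
```

Opening the printed address in a browser should return the matching record identifiers, provided the assumptions above hold.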
  20. 20. Practice Searching <ul><li>Some more searches to try (in both MMDB and PDB): </li></ul><ul><ul><li>Find the Rat Neuronal Nitric Oxide Synthase Oxygenase Domain Complexed With The Inhibitor Ar-R17477 </li></ul></ul><ul><ul><li>Find a recombinant hemoglobin structure with 3 protein chains </li></ul></ul><ul><ul><li>Find the Bovine Pancreas Beta-Trypsin In Complex With Benzamidine </li></ul></ul>
  21. 21. Structure Results (DocSum): Click on the PDB identifier or photo to see more detailed records. Description as provided by author. Links to related structures and other related info.
  22. 22. MMDB Structure Records <ul><li>See example: </li></ul><ul><li>Record includes </li></ul><ul><ul><li>Thumbnail photo of structure and links to Viewers </li></ul></ul><ul><ul><li>Identifiers for both MMDB and PDB </li></ul></ul><ul><ul><li>Description, Reference, Taxonomy information </li></ul></ul><ul><ul><li>Structure Components </li></ul></ul><ul><ul><ul><li>Protein Chains with conserved domains illustrated </li></ul></ul></ul><ul><ul><ul><li>Ligands and Ions </li></ul></ul></ul>
  23. 23. Comparing PDB and MMDB <ul><li>MMDB is a subset of PDB </li></ul><ul><li>MMDB is a part of the NCBI’s Entrez system </li></ul><ul><li>MMDB uses an ASN.1 format which contains value-added information (explicit chemical bonds and secondary structures) </li></ul><ul><li>PDB supports several data formats. There are some inconsistencies in data formats over time. </li></ul><ul><li>All structures are entered into PDB first and then are automatically drawn into MMDB on a monthly basis. PDB updates weekly. Therefore, there may be new structures in PDB that are not yet in MMDB. </li></ul><ul><li>PDB allows more variability in choice of viewer software, while MMDB is designed to be used with Cn3D (the NCBI’s viewer) </li></ul>
  24. 24. Viewing Structures <ul><li>There are a large number of structure viewing programs available free online </li></ul><ul><li>We will start with the NCBI’s Cn3D and then look at some of the others </li></ul><ul><li>Each of the programs offers some similar functionality and some unique features </li></ul>
  25. 25. Introducing Cn3D <ul><li>Cn3D is a helper application for a web browser (e.g., Netscape or Internet Explorer) that allows viewing of 3-dimensional structures from NCBI's structure database. </li></ul><ul><li>Free download from the NCBI web site: Cn3D Download Page </li></ul><ul><li>Choose the correct operating system for your computer and click on the appropriate link: PC, Mac or UNIX. </li></ul><ul><li>Follow the directions to download the file and install it. </li></ul><ul><li>Cn3D should automatically install itself and connect to your browser. </li></ul><ul><li>The NCBI also has a software package called CDTree that includes Cn3D along with other functionality (phylogenetic and conserved domain analysis). It is not covered in this module. </li></ul>
  26. 26. Using Cn3D <ul><li>Let’s go live! </li></ul><ul><li>Manipulating the Viewer </li></ul><ul><ul><li>Structure: 1REV (HIV reverse transcriptase) </li></ul></ul><ul><ul><li>Structure: 1DMSO (calmodulin) </li></ul></ul><ul><ul><li>Structure: 1GAT (C-terminal domain - zinc-finger DNA binding domain - of the Gata-1 protein bound to its DNA substrate) </li></ul></ul><ul><li>Interposing a Sequence Alignment on a Structure </li></ul><ul><ul><li>Align the Apolipoprotein IV sequence (NP_000473) with the structure of chain A of Apolipoprotein I (1AV1) </li></ul></ul>
  27. 27. What Cn3D Does (and Doesn’t) <ul><li>Cn3D will… </li></ul><ul><ul><li>import and show existing structure alignments from the NCBI’s database of structure alignments. </li></ul></ul><ul><ul><li>create BLAST alignments based on imported sequences. Note that BLAST alignments are always pair-wise. </li></ul></ul><ul><ul><li>show CDD (Conserved Domain Database) alignments, which, unlike BLAST alignments, are multiple sequence alignments. </li></ul></ul><ul><li>Cn3D will not… </li></ul><ul><ul><li>create a structure based on a sequence </li></ul></ul><ul><ul><li>do new searches of the structure database </li></ul></ul>
  28. 28. More About Structural Alignments <ul><li>Structural alignments can only be constructed and viewed if the structures of both molecules have already been experimentally determined and entered into MMDB. Structural alignments are constructed using a program called VAST (Vector Alignment Search Tool). </li></ul>
  29. 29. VAST – Vector Alignment Search Tool <ul><li>VAST is a software program that compares three-dimensional molecular structures for structural similarity. This is different from BLAST, which does linear sequence alignments. VAST comparisons are automatically done for every structure in MMDB, and that information is made available through a link in each MMDB structure summary, the ‘VAST’ link. Note that VAST can only compare known (experimentally determined) 3D structures. </li></ul><ul><li>VAST does NOT do structure predictions for linear protein sequences. </li></ul>
  30. 30. Why Do Structural Alignments? <ul><li>To compare structures </li></ul><ul><li>To find homologs that sequence searches cannot: distant protein homologs often conserve structure more strongly than sequence </li></ul><ul><li>To explore protein evolution </li></ul><ul><li>To identify conserved core elements of a protein fold that can be used to model related proteins of unknown structure </li></ul>
  31. 31. Understanding 3-D Domains <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for the example. </li></ul>
  32. 32. Identifying 3-D Domains <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for the example. </li></ul>
  33. 33. Numbering 3-D Domains <ul><li>3D domains are identified by </li></ul><ul><li>PDB code for the structure they belong to </li></ul><ul><li>The chain designation (A, B, etc.) </li></ul><ul><li>The specific domain # (1, 2, etc.) </li></ul><ul><li>Therefore 1EJ9A3 means the 3rd domain of the A chain of structure 1EJ9. </li></ul><ul><li>Remember: the purpose of 3D domains is structural comparison with VAST. They are purely based on structure and may have no relationship to sequence-based analysis. </li></ul>
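The naming scheme above can be unpacked mechanically. The sketch below is illustrative: the function and class names are hypothetical, and it assumes a 4-character PDB code that starts with a digit, followed by a single-letter chain and a numeric domain number, as in the 1EJ9A3 example. Identifiers that deviate from that pattern are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class DomainId(NamedTuple):
    pdb_code: str
    chain: str
    domain_number: int


# Assumed pattern: 4-character PDB code, one chain letter, domain digits.
_DOMAIN_PATTERN = re.compile(r"^([0-9][A-Za-z0-9]{3})([A-Za-z])([0-9]+)$")


def parse_domain_id(identifier):
    """Split an identifier such as '1EJ9A3' into its three parts."""
    match = _DOMAIN_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized 3D domain identifier: {identifier!r}")
    pdb_code, chain, number = match.groups()
    return DomainId(pdb_code.upper(), chain, int(number))


domain = parse_domain_id("1EJ9A3")
print(domain)
```

For the slide's example this yields PDB code 1EJ9, chain A, domain 3.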
  34. 34. How Does VAST Work? <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for the example. </li></ul>
  35. 35. How Does VAST Work? (2) <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for the example. </li></ul>
  36. 36. How Does VAST Work? (3) <ul><li>Thank you to the NCBI’s Dr. Eric Sayers for the example. </li></ul>
  37. 37. Let’s Go Live! <ul><li>Structural Alignments </li></ul><ul><ul><li>Compare the B chains of healthy Hemoglobin (1XZ2) with Sickle Cell Hemoglobin (2HBS) </li></ul></ul><ul><ul><li>Compare the Aquaporin 1, PDB ID# 1H6I (MMDB ID# 18385) Chain A with the Crystal Structure Of The E. Coli Glycerol Facilitator (Glpf) With Substrate Glycerol, PDB ID# 1FX8 (MMDB ID# 14737) Chain A </li></ul></ul><ul><ul><li>What proteins have similar structures to the Drosophila calmodulin (4CLN) protein? </li></ul></ul><ul><ul><li>What proteins have similar structures to the human protein leptin? </li></ul></ul><ul><ul><li>Compare two conformations of the same protein: 1YER with 1YES </li></ul></ul>
  38. 38. Other Structure Viewers <ul><li>Find the following structures in PDB and then manipulate them with the available viewer programs </li></ul><ul><ul><li>Crystal structure of the cholera toxin from bacteria, Vibrio cholerae </li></ul></ul><ul><ul><li>1.70 angstrom crystal structure of human mitochondrial ferritin </li></ul></ul><ul><ul><li>The H1 hemagglutinin protein from the 1918 influenza virus as published by Gamblin, et al. in Science v303 pp. 1838-42, 2004 </li></ul></ul><ul><ul><li>The structure of a zinc finger domain bound to a DNA substrate </li></ul></ul><ul><ul><li>Structure of the beta amyloid A4 (40 residue) peptide involved in Alzheimer’s Disease </li></ul></ul><ul><ul><li>Structure of Carbonic Anhydrase I bound to a Sulfonamide drug </li></ul></ul><ul><ul><li>Structure of bovine rhodopsin </li></ul></ul><ul><li>Try finding them in Entrez Structure and view with Cn3D for comparison </li></ul>