The DNA sequence of a gene rarely has functional value on its own; it is the proteins that genes encode that matter to the organism. The process of reading the code in DNA and converting it into a functional protein is highly conserved across almost all branches of life. An enzyme called RNA polymerase constructs an RNA copy of a gene’s DNA sequence through a process called transcription. This RNA molecule is then read by ribosomes, which assemble amino acids into chains in the order the RNA specifies. This latter process is known as translation. To summarize: DNA sequences are transcribed into RNA sequences, which are then translated into proteins.
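As an illustration of the DNA to RNA to protein flow described above, here is a minimal Python sketch. It rests on simplifying assumptions: the input is taken to be the coding (sense) strand, so transcription reduces to replacing T with U, and translation starts at the first AUG and stops at the first stop codon. The function names and the example sequence are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, laid out in the conventional U, C, A, G order
# for the first, second, and third codon positions ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (assumption: T is simply replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTTTTAA"          # hypothetical example sequence
mrna = transcribe(dna)        # "AUGGCCUUUUAA"
protein = translate(mrna)     # "MAF" in one-letter amino acid codes
print(mrna, protein)
```

In a real cell, RNA polymerase reads the template strand and produces an RNA that matches the coding strand, which is why the sketch can work directly from the coding strand.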
A gene sequence is not simply a series of codons; it has several key components. Promoter sequences help RNA polymerase attach to the DNA template. Once the DNA sequence has been transcribed, the RNA must still be processed. One of the most unexpected findings in the history of molecular genetics was the discovery that genes are split into pieces. Exons, which are composed of codons, are often interrupted by intron sequences that do not encode amino acids. Before translation can occur, the intron sequences must be spliced out of the RNA. The exons are then joined together for translation into proteins.
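The splicing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pre-mRNA sequence and the exon coordinates are made up, and the exon positions are given directly as 0-based, half-open intervals rather than detected from the sequence.

```python
def splice(pre_mrna, exons):
    """Join the exon segments of a pre-mRNA.

    exons is an ordered list of (start, end) pairs, 0-based and half-open.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


def intron_sequences(pre_mrna, exons):
    """Return the sequences lying between consecutive exons."""
    return [
        pre_mrna[prev_end:next_start]
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


# Hypothetical pre-mRNA: exon 1, one intron (starting GU, ending AG), exon 2.
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "UUUUAA"
exons = [(0, 6), (16, 22)]

mature = splice(pre_mrna, exons)              # "AUGGCCUUUUAA"
introns = intron_sequences(pre_mrna, exons)   # ["GUAAGUCCAG"]
codons = [mature[i:i + 3] for i in range(0, len(mature), 3)]
print(mature, introns, codons)
```

Only after the intron is removed do the codons of the two exons line up into one continuous reading frame.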
Figure: a representation of the steps involved in creating a protein from a DNA sequence.
Introduction to Bioinformatics Shivani Chandra The Birla Institute of Scientific Research
Bioinformatics is the development and use of computer applications for the analysis, interpretation, simulation, and prediction of biological systems and the corresponding experimental methods in the natural sciences.
browser for the major divisions of living organisms (archaea, bacteria, eukaryota, viruses)
taxonomy information such as genetic codes
molecular data on extinct organisms
Question #1: How can I use PubMed at NCBI to find literature information?
PubMed is the NCBI gateway to MEDLINE. MEDLINE contains bibliographic citations and author abstracts from over 4,000 journals published in the United States and in 70 foreign countries. It has 12 million records dating back to 1966.
MeSH stands for "Medical Subject Headings." It is the list of vocabulary terms used for subject analysis of biomedical literature at NLM, and it is used to index journal articles for MEDLINE. The MeSH controlled vocabulary imposes uniformity and consistency on the indexing of biomedical literature.
PubMed search strategies
Try the tutorial (“education” on the left sidebar)
Use boolean queries (e.g., lipocalin AND disease)
Try using “limits”
Try “LinkOut” to find external resources
Obtain articles on-line via Welch Medical Library (and download pdf files): http://www.welch.jhu.edu/
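The boolean query above is typed into the PubMed web page. As a hedged illustration, the same query can also be expressed as a request to NCBI's E-utilities ESearch service, which is a different access route from the one on the slides. The sketch below only builds the request URL and does not contact the server; the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities ESearch service.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def boolean_query(terms, operator="AND"):
    """Combine search terms with a boolean operator (AND, OR, NOT)."""
    return f" {operator} ".join(terms)


def pubmed_search_url(query, max_results=5):
    """Build an ESearch URL for a PubMed query (hypothetical helper)."""
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": query,          # the boolean query string
        "retmax": max_results,  # number of record IDs to return
        "retmode": "json",      # ask for a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


query = boolean_query(["lipocalin", "disease"])   # "lipocalin AND disease"
url = pubmed_search_url(query)
print(url)
```

Opening the printed URL in a browser, or fetching it with `urllib.request.urlopen`, should return the identifiers of matching PubMed records.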
GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA sequences.
A new release is made every two months. GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI.
These three organizations exchange data on a daily basis.