HBC1019 Biochemistry 1 Trimester 1, 2010/2011
Practical 7: DNA, RNA and the Flow of Genetic Information


  1. Faculty of Information Science & Technology
     LAB REPORT
     HBC 1019 - Biochemistry I
     Practical 7: DNA, RNA and the Flow of Genetic Information
     Name: Osama Barayan
     ID: 1091105869
  2. Introduction
     Biological databases are often referred to as sequence or structure libraries that contain a huge amount of information about the sequence and structure of nucleic acids (DNA, RNA) and proteins. This practical will introduce you to some of the relevant databases. They are very useful and are becoming important resources for the study of biochemistry and bioinformatics at all levels.

     Finding databases
     a. What are the major online databases that contain DNA and protein sequences?
        1. http://www.ncbi.nlm.nih.gov/
        2. http://www.cellbiol.com/
        3. http://www.biochemweb.org/
        4. http://nar.oxfordjournals.org/
     b. Which databases contain entire genomes?
        Many sites on the internet host entire genomes, for example http://www.ncbi.nlm.nih.gov/
     c. Define and understand the meaning of the following terms; once you have defined them, please provide the link(s) as well.
        i. BLAST
           Basic Local Alignment Search Tool, or BLAST, is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences.
        ii. Taxonomy
           The science of the classification of living things, grouped by similarity: species are grouped into genera, genera into families, families into orders, orders into classes, classes into phyla, and phyla with similar characteristics are grouped together at the top level of the classification.
        iii. Gene ontology
           The Gene Ontology, or GO, is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species.
        iv. Phylogenetic tree
           A phylogenetic tree or evolutionary tree is a branching diagram or "tree" showing the inferred evolutionary relationships among various biological species or other entities, based upon similarities and differences in their physical and/or genetic characteristics.
        v. Multiple sequence alignment
           A multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA.
  3. 5. Analyzing DNA sequence
     You will learn how to analyze a given DNA sequence by identifying an open reading frame, determining the protein that it will express and finding the bacterial source for that protein. This is the DNA sequence:

     TACGCAATGCGTATCATTCTGCTGGGCGCTCCGGGCGCAGGTAAAGGTACTCAGGCTCAATTCATC
     ATGGAGAAATACGGCATTCCGCAAATCTCTACTGGTGACATGTTGCGCGCCGCTGTAAAAGCAGGT
     TCTGAGTTAGGTCTGAAAGCAAAAGAAATTATGGATGCGGGCAAGTTGGTGACTGATGAGTTAGTT
     ATCGCATTACTCAAAGAACGTATCACACAGGAAGATTGCCGCGATGGTTTTCTGTTAGACGGGTTC
     CCGCGTACCATTCCTCAGGCAGATGCCATGAAAAAGAAGCCGGTATCAGTTGATTATGTGCTGGAG
     TTTGATGTTCCAGACGAGCTGATTGTTGAGCGCATTGTCGGCCGTCGGGTACATGCTGCTTCAGGC
     CGTGTTTATCACGTTAAATTCAACCCACCTAAAGTTGAAGATAAAGATGATGTTACCGGTGAAGAG
     CTGACTATTCGTAAAGATGATCAGGAAGCGACTGTCCGTAAGCGTCTTATCGAATATCATCAACAA
     ACTGCACCATTGGTTTCTTACTATCATAAAGAAGCGGATGCAGGTAATACGCAATATTTTAAACTG
     GACGGAACCCGTAATGTAGCAGAAGTCAGTGCTGAACTGGCGACTATTCTCGGTTAATTCTGGATG
     GCCTTATAGCTAAGGCGGTTTAAGGCCGCCTTAGCTATTTCAAGTAAGAAGGGCGTAGTACCTACA
     AAAGGAGATTTGGCATGATGCAAAGCAAACCCGGCGTATTAATGGTTAATTTGGGGACACCAGATG
     CTCCAACGTCGAAAGCTATCAAGCGTTATTTAGCTGAGTTTTTGAGTGACCGCCGGGTAGTTGATA
     CTTCCCCATTGCTATGGTGGCCATTGCTGCATGGTGTTATTTTACCGCTTCGGTCACCACGTGTAG
     CAAAACTTTATCAATCCGTTTGGATGGAAGAGGGCTCTCCTTTATTGGTTTATAGCCGCCGCCAGC
     AGAAAGCACTGGCAGCAAGAATGCCTGATATTCCTGTAGAATTAGGCATGAGCTATGGTTCAC

     a. What is an Open Reading Frame (ORF) and reading frame?
        An ORF is any region of DNA or RNA where a protein could be encoded: a string of nucleotides in which one of the three reading frames has no stop codons. A reading frame is one of the three ways of dividing a strand into consecutive, non-overlapping three-base codons, depending on which base the first codon starts at.
     b. Try to find an ORF from the segment of DNA above by finding the first start codon and the first in-frame stop codon.
        In bacteria, an open reading frame on a piece of mRNA almost always begins with AUG, which corresponds to ATG in the DNA segment that codes for the mRNA. According to the standard genetic code, there are three stop codons on mRNA: UAA, UAG, and UGA, which correspond to TAA, TAG, and TGA in the parent DNA segment. Here are the rules for finding an open reading frame in this piece of bacterial DNA:
        i. It must start with ATG. In this exercise, the first ATG is the start codon.
        ii. It must end with TAA, TAG, or TGA.
        iii. It must be at least 300 nucleotides long (coding for 100 amino acids).
        iv. The ATG start codon and the stop codon must be in frame. This means that the total number of bases in the sequence from the start to the stop codon must be evenly divisible by 3.
     c. Copy the entire sequence again and go to the Translate tool on the ExPASy server (http://www.expasy.org/tools/dnal.htm). Paste the sequence in the box, select “Verbose (“Met”, “Stop”, spaces between residues)” as the Output format and click on “Translate Sequence”.
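The ORF rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used by the ExPASy Translate tool: the function names `find_first_orf` and `translate` are hypothetical, the 300-nucleotide minimum is assumed to include the stop codon, and only the forward strand is scanned from the first ATG, as the exercise specifies.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_first_orf(dna, min_length=300):
    """Return the ORF from the first ATG to the first in-frame stop codon.

    Returns None if there is no ATG, no in-frame stop, or the ORF
    (stop codon included, an assumption) is shorter than min_length.
    """
    dna = "".join(dna.split()).upper()  # drop spaces and line breaks
    start = dna.find("ATG")             # rule i: the first ATG is the start
    if start == -1:
        return None
    # Rules ii and iv: step three bases at a time so the stop is in frame.
    for pos in range(start, len(dna) - 2, 3):
        if dna[pos:pos + 3] in STOP_CODONS:
            orf = dna[start:pos + 3]
            # Rule iii: enforce the minimum length.
            return orf if len(orf) >= min_length else None
    return None


def translate(orf):
    """Translate an ORF into one-letter amino acids, stopping at the stop codon."""
    protein = []
    for pos in range(0, len(orf) - 2, 3):
        residue = CODON_TABLE[orf[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Toy input: the first 36 bases of the practical's sequence plus an added TAA.
fragment = "TACGCAATGCGTATCATTCTGCTGGGCGCTCCGGGC" + "TAA"
orf = find_first_orf(fragment, min_length=30)
print(orf)
print(translate(orf))
```

Passing the full sequence from the practical with the default `min_length=300` applies all four rules as stated; the shortened fragment here is used only to keep the example small.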
  4. What are the results of translation? Identify the reading frame that contains a protein (more than 100 continuous amino acids with no interruptions by a stop codon) and its name.
     The first forward frame (5'3' Frame 1), which begins at the first base of the sequence, contains the protein:

     Y A Met R I I L L G A P G A G K G T Q A Q F I Met E K Y G I P Q I S T G D Met L R A A V K A G S E L G L K A K E I Met D A G K L V T D E L V I A L L K E R I T Q E D C R D G F L L D G F P R T I P Q A D A Met K K K P V S V D Y V L E F D V P D E L I V E R I V G R R V H A A S G R V Y H V K F N P P K V E D K D D V T G E E L T I R K D D Q E A T V R K R L I E Y H Q Q T A P L V S Y Y H K E A D A G N T Q Y F K L D G T R N V A E V S A E L A T I L G Stop F W Met A L Stop L R R F K A A L A I S S K K G V V P T K G D L A

     Now change the Output format from the earlier page to “Compact (“M”, “-”, no spaces)”. Go to the same reading frame as before and copy the protein sequence (in one-letter abbreviations) starting with “M” for the start codon. Paste the sequence in your answer.

     MRIILLGAPGAGKGTQAQFIMEKYGIPQISTGDMLRAAVKAGSELGLKAKEIMDAGKL
     VTDELVIALLKERITQEDCRDGFLLDGFPRTIPQADAMKKKPVSVDYVLEFDVPDELIVE
     RIVGRRVHAASGRVYHVKFNPPKVEDKDDVTGEELTIRKDDQEATVRKRLIEYHQQTAPL
     VSYYHKEADAGNTQYFKLDGTRNVAEVSAELATILG

     d. Now you will identify the protein and the bacterial source. Go to the NCBI BLAST page (http://www.ncbi.nlm.nih.gov/BLAST/). What are the different types of BLAST program and what are their functions?
        Nucleotide blast: search a nucleotide database using a nucleotide query
        Protein blast: search a protein database using a protein query
        blastx: search a protein database using a translated nucleotide query
        tblastn: search a translated nucleotide database using a protein query
        tblastx: search a translated nucleotide database using a translated nucleotide query

     You will do a simple BLAST search using your protein sequence, but you can do much more with BLAST. You are encouraged to try the tutorials on the BLAST site (http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/tut1.html).
     On the BLAST page, select “Protein-protein BLAST.” Enter your protein sequence in the “Search” box. Use the default values for the rest of the page and click on the “BLAST!” button. You will be taken to the “formatting BLAST” page. Click on the “Format!” button. You may have to wait for the results. Your protein should be the first one listed in the BLAST output.
  5. 6. Sequence homology
     You will use BLAST to look for sequences that are homologous to the protein that you identified in problem 2.
     a. Define homolog, ortholog and paralog.
        A homolog is a gene or protein that is related to another by descent from a common ancestral sequence. (The chemical sense of the word, a compound from a series of compounds that differ only in the number of repeated structural units, does not apply here.)
        An ortholog is either of two or more homologous gene sequences found in different species.
        A paralog is either of a pair of genes in the same genome that derive from the same ancestral gene by duplication.
     b. Go to the NCBI BLAST page (http://www.ncbi.nlm.nih.gov/BLAST/) and choose “Protein-protein BLAST.” Paste your protein sequence into the “Search” box. Before clicking on the “BLAST!” button, narrow the search by kingdom. As you look down the BLAST page, you'll see an Options section; under “choose search set” (followed by an empty box) or “select from:”, key in “Eukaryota.” Now click on the “BLAST!” button. Click on the “Format!” button on the next page.
        Can you find a homologous sequence from yeast? (Hint: Use your browser's Find tool to search for the term “Saccharomyces.”) Note the Score and E value given at the right of the entry.
        Yes.
        Can you find a homologous sequence from humans? (Hint: Search for the term “Homo.”) Note its Score and E value.
        Yes:
        - Cytidylate kinase: Score 90.5, query coverage 98%, E value 4e-18
        - Cytidine monophosphate: Score 90.1, query coverage 98%, E value 5e-18
        - UMP-CMP kinase isoform a: Score 89.7, query coverage 98%, E value 6e-18
        Most biochemists consider 25% identity the cutoff for sequence homology, meaning that if two proteins are less than 25% identical in sequence, more evidence is needed to determine whether they are homologs. Click on the Score values for the yeast and human proteins to see each sequence aligned with your query sequence and to see the percent sequence identity.
        Are the yeast and human sequences homologous to your query sequence?
        Yes.
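To make the 25% identity cutoff mentioned above concrete, here is a minimal sketch of how percent identity can be computed from two already-aligned sequences. The function names are hypothetical, the subject sequences are made up for illustration (they are not BLAST hits from this practical), and the sketch assumes identities are counted over the full alignment length with gaps written as "-".

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment columns in which both sequences have the same residue.

    Both inputs must be the same length; gaps are written as '-'.
    Assumption: the denominator is the full alignment length, gaps included.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment is empty")
    identities = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    return 100.0 * identities / len(aligned_query)


def meets_identity_cutoff(aligned_query, aligned_subject, cutoff=25.0):
    """True when percent identity reaches the cutoff (25% by default)."""
    return percent_identity(aligned_query, aligned_subject) >= cutoff


# First ten residues of the query protein against a made-up subject.
print(percent_identity("MRIILLGAPG", "MRIVLLGPPG"))
# A made-up alignment containing one gap.
print(percent_identity("MK-LV", "MKALI"))
```

A pair that falls below the cutoff is not necessarily unrelated; as the text notes, it only means more evidence is needed.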
  6. c. What do Score and E-value stand for? Use the BLAST online tutorial (http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html) to discover the meaning. What is the difference between an identity and a conservative substitution? From the BLAST result you obtained, provide an example from the comparison of your sequence and a homologous sequence.
        Score (S) is a measure of the similarity of the query to the sequence shown. E-value is a measure of the reliability of the S score: it is the number of alignments with a score at least as high that would be expected by chance in a database of that size, so a smaller E-value means a more significant match.

     BLAST uses a substitution matrix to assign values in the alignment process, based on the analysis of amino acid substitutions in a wide variety of protein sequences. Make sure you understand the meaning of the term “substitution matrix.”
     What is the default substitution matrix on the BLAST page?
        BLOSUM62.
     What other matrices are available?
        PAM1, PAM250, PAM30, PAM70, BLOSUM45, BLOSUM80
     What is the source of the names for these substitution matrices?
        PAM = Point Accepted Mutation. This matrix works by observing differences between closely related proteins.
        BLOSUM = BLOcks SUbstitution Matrix. This matrix captures the small changes in sequences that can happen during evolution, and is built from multiple alignments of evolutionarily divergent proteins.
     Repeat the BLAST search in Problem 3(b) using a different substitution matrix. (Look for algorithm parameters.) Do you find different answers?
        Yes.
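The relationship between Score and E-value described above can be sketched with the standard Karlin-Altschul relation for bit scores, E = m × n × 2^(−S'), where S' is the bit score, m the query length and n the database length. This is a simplified sketch: the function name `expected_chance_hits` is hypothetical, the lengths in the example are made-up round numbers, and BLAST itself uses edge-corrected "effective" lengths, so its reported E-values will differ from this raw calculation.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    Assumption: raw lengths are used instead of BLAST's effective lengths.
    """
    if query_length <= 0 or database_length <= 0:
        raise ValueError("lengths must be positive")
    return query_length * database_length * 2.0 ** (-bit_score)


# Made-up lengths, chosen only to show the trend: each extra bit of
# score halves the number of hits expected by chance.
query_length = 100
database_length = 1024
for bit_score in (10, 11, 20):
    print(bit_score, expected_chance_hits(bit_score, query_length, database_length))
```

The same formula shows why E-values depend on the database searched: doubling the database length doubles E for an unchanged score.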