BGE 313
Bioinformatics Laboratory
Submitted To:
Dr. Siraje Arif Mahmud
Associate Professor
Dept. of Biotechnology and Genetic Engineering
Jahangirnagar University
Submitted By:
Effat Jahan Tamanna
Roll No: 131647
Reg No: 36401
Session: 2012 – 13
INDEX
Serial No | Date | Name of the Experiment
01 | 21.04.2016 | Searching basics, AND, OR, NOT, “keywords together”, *
02 | 21.04.2016 | Searching PMC and PubMed Using Authors name, fields, limits
03 | 02.05.2016 | Retrieving protein sequences using UniProt and creating multi-fasta files
04 | 02.05.2016 | Retrieving relevant DNA sequences using nucleotide and creating multi-fasta file (search by ldh1 NOT hypothetical)
05 | 02.05.2016 | Performing DNA and protein BLAST and analyzing result
06 | 03.05.2016 | Pairwise alignment (global, end gap free), calculate identities, dotplot using BioEdit
07 | 03.05.2016 | Nucleotide composition, complement, reverse complement, DNA to RNA, translate, restriction map, six frame translation using BioEdit
08 | 03.05.2016 | Multiple sequence analysis using BioEdit
09 | 03.05.2016 | Tree Generation with MEGA
10 | 09.05.2016 | Working with single protein sequence: Analyzing protein composition (pepdigest, pepstats), Protein secondary structure by mEmboss (garnier for protein secondary structure, helixturnhelix for motifs, pepcoil for coiled coil regions)
11 | 09.05.2016 | RNA structure prediction using RNAstructure
Experiment No 01:
Searching basics, AND, OR, NOT, “keywords together”, *
i. Searching Basics
Methods:
 Open PubMed home page
 In the PubMed search box write alpha amylase and click Search. 10639 results are shown.
 To filter the results, click Free full text in the Text availability section. The results are reduced
to 3108.
 Then click 5 years in the Publication dates section. The results are reduced to 792.
 Click Review in the Article types section. The results are reduced to 16.
 To clear a filter, click Clear on the right side of that filter type, or click Clear all.
Result:
Interpretation:
PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts
on life sciences and biomedical topics. PubMed is maintained and updated by the National Library of
Medicine on a weekly basis. A search on alpha amylase returns all articles related to alpha amylase.
The filters narrow this list: Free full text keeps only articles whose full text is freely available,
5 years keeps only articles published in the previous 5 years, and Review keeps only review articles.
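The filter clicks above can also be written as one query string. The following is a minimal sketch, assuming the PubMed search tags free full text[sb], "last 5 years"[dp] and review[pt] and the NCBI E-utilities esearch address; build_pubmed_url is a hypothetical helper name, and the block only builds the address without contacting the server. Counts returned today will differ from the 2016 numbers recorded above.

```python
from urllib.parse import urlencode

# Assumed address of the NCBI E-utilities search service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_pubmed_url(term, filters=()):
    """Join a search term and filter tags with AND and return an esearch URL."""
    query = " AND ".join([term, *filters])
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": query})

url = build_pubmed_url(
    "alpha amylase",
    filters=["free full text[sb]", '"last 5 years"[dp]', "review[pt]"],
)
print(url)
```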
ii. Using Boolean Operators
a. AND
Methods:
 Open PubMed home page
 Write gyrase AND topoisomerase and search. 1986 results are found.
 Then apply the filters Free full text, Review and 5 years in turn. The results are reduced to 1032, 31
and 6 respectively.
Result:
Interpretation:
AND requires both terms to be in each item returned. If one is contained in the document and the other is
not, the item is not included in the resulting list (narrows the search). A search on gyrase AND
topoisomerase therefore returns only articles that contain both keywords.
b. OR
Methods:
 Open PubMed home page
 Search antibody OR immunoglobulin.
 1247231 results are found.
 Now filtering. Free full text – 363108, Review – 17602, 5 years – 7068
Result:
Interpretation:
Either term (or both) will be in the returned document (broadens the search). A search on antibody OR
immunoglobulin returns articles containing the word antibody (but not immunoglobulin), articles
containing the word immunoglobulin (but not antibody), and articles containing both words, in any
order and any number of times.
c. NOT
Methods:
 Open PubMed home page
 Search immunoglobulin NOT IgG NOT IgA NOT IgM
 693211 results are found.
 Now filtering
 Free full text - 181743
 Review - 10047
 5 years – 3618
Result:
Interpretation:
NOT first finds the records containing the first term, then subtracts any records containing the term
after the operator. A search on immunoglobulin NOT IgG NOT IgA NOT IgM therefore returns articles
about immunoglobulin that do not mention IgG, IgA or IgM.
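The three Boolean operators behave like set operations on the lists of matching records. The sketch below illustrates this with a made-up index of article numbers; the numbers and the function names are assumptions for illustration and have nothing to do with real PubMed records.

```python
# Toy index: made-up article numbers that mention each keyword.
index = {
    "antibody": {1, 2, 3},
    "immunoglobulin": {2, 3, 4, 5},
    "IgG": {3, 5},
}

def search_and(a, b):
    return index[a] & index[b]   # intersection: both terms required (narrows)

def search_or(a, b):
    return index[a] | index[b]   # union: either term accepted (broadens)

def search_not(a, b):
    return index[a] - index[b]   # difference: records with b are subtracted

print(search_and("antibody", "immunoglobulin"))
print(search_or("antibody", "immunoglobulin"))
print(search_not("immunoglobulin", "IgG"))
```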
iii. Quoted (“ ”) phrase search
Methods:
 Open PubMed home page
 Search “alpha amylase”
 8015 results are found
 Now Filtering
 Free full text – 2393
 Review – 25
 5 years – 12
Result:
Interpretation:
When a term is searched inside quotation marks, only records containing that exact phrase are shown.
A search on “alpha amylase” returns articles that contain the two words together and in that order, so
the result is more specific than the unquoted search (8015 results instead of 10639).
iv. * search
Methods:
 Open PubMed home page
 Search ldh*
 25235 results are found
 Now filtering
 Free full text – 5691
 Review – 96
 5 years – 46
Result:
Interpretation:
When a term is searched with a trailing *, records containing any word that begins with that term are
shown. A search on ldh* therefore returns articles containing ldh as well as longer terms that start
with ldh, such as ldh1.
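The difference between a quoted phrase search and a * search can be sketched on a few made-up titles. This is a simplified illustration and not how PubMed indexes records: the phrase search is a plain substring test and the wildcard search is a word-prefix test.

```python
# Made-up titles used only for illustration.
titles = [
    "alpha amylase inhibitors in wheat",
    "amylase activity of alpha proteobacteria",
    "ldha expression in tumours",
    "ldh1 and ldh2 isozymes",
]

def phrase_search(phrase, docs):
    """Keep documents in which the words of the phrase occur together, in order."""
    return [d for d in docs if phrase.lower() in d.lower()]

def wildcard_search(prefix, docs):
    """Keep documents with at least one word that starts with the text before the *."""
    prefix = prefix.lower()
    return [d for d in docs if any(w.startswith(prefix) for w in d.lower().split())]

print(phrase_search("alpha amylase", titles))
print(wildcard_search("ldh", titles))
```

The second title contains both words but not as a phrase, so only the quoted search excludes it.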
Experiment No 02:
Searching PMC and PubMed Using Authors name, fields, limits
i. Searching PubMed using author name
Methods:
 Open PubMed home page
 Write Schilling CH (1999) in search box and search.
 2 results are found
Result:
Interpretation:
A search on Schilling CH (1999) returns the articles in PubMed by the author Schilling CH that were
published in 1999.
ii. Searching PMC using author name
Methods:
 Open PubMed home page
 Select PMC.
 Write Schilling CH in search box and search.
 9 results are found
Result:
Interpretation:
PubMed Central is a free digital archive of articles, accessible to anyone from anywhere via a basic web
browser. The full text of all PubMed Central articles is free to read, with varying provisions for reuse.
A search on Schilling CH returns the free full-text articles in PMC by the author Schilling CH.
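An author search can be made explicit with field tags. The sketch below assumes the search field tags [au] for author and [dp] for publication date; author_query is a hypothetical helper that only builds the query text.

```python
def author_query(author, year=None):
    """Build a query restricted to the author field, optionally to one year."""
    query = f"{author}[au]"
    if year is not None:
        query += f" AND {year}[dp]"
    return query

print(author_query("Schilling CH", 1999))
print(author_query("Schilling CH"))
```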
Experiment No 03:
Retrieving protein sequences using UniProt and creating multi-fasta files
Methods:
 Open UniProt home page.
 Search by writing ldh1.
 Filter this by clicking Reviewed (893) from filter by option.
 In left side in the box of other organisms write lactobacillus and click go. 16 results will be shown.
 Now click on the box of Entry for selecting all.
 Now click on download → uncompressed → go
 Select all sequences. (Ctrl + A)
 Copy all sequences. (Ctrl + C)
 Open Notepad.
 Paste All Sequences (Ctrl + V)
 Now save these sequences. Click file → save as ( Ctrl + S) → select location → write ldh1P.fasta in
file name → click save
Result:
Interpretation:
UniProt is the Universal Protein resource, a central repository of protein data created by combining the
Swiss-Prot, TrEMBL and PIR-PSD databases. Any protein sequence in UniProt can be searched for and
downloaded. Here the selected sequences were combined into one multi-FASTA file with Notepad, which
can be reused as input in the later experiments.
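A multi-FASTA file is a series of records, each made of a header line starting with > followed by one or more sequence lines. A minimal sketch of reading such text is given below; parse_fasta is a hypothetical helper and the two entries are made up for illustration, not taken from UniProt.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from multi-FASTA text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

example = """>EXAMPLE1 made-up entry
MKVAVIG
AGNVG
>EXAMPLE2 made-up entry
MTKIAL
"""
records = parse_fasta(example)
for header, sequence in records:
    print(header, len(sequence))
```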
Experiment No 04:
Retrieving relevant DNA sequences using nucleotide and creating multi-fasta file.
(Search by ldh1 NOT hypothetical)
Methods:
 Open PubMed Home Page.
 Select nucleotide.
 Search by writing ldh1 NOT hypothetical. 51 results will be found.
 Select Number 4, 5, 6, 9.
 Select on the right arrow of Summary. From here select FASTA (text).
 Select all sequences (Ctrl + A).
 Copy all sequences (Ctrl + C).
 Open Notepad.
 Paste All Sequences (Ctrl + V)
 Now save these sequences. Click file → save as ( Ctrl + S) → select location → write ldh1.fasta
in file name → click save
Result:
Interpretation:
Nucleotide sequences can be searched in the NCBI Nucleotide database, which was reached here from the
PubMed page. The selected sequences were combined into one multi-FASTA file with Notepad, which can
be reused as input in the later experiments.
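The copy and paste steps above amount to picking some records from the result list and writing them one after another in FASTA format. A minimal sketch follows; to_fasta is a hypothetical helper, and the hit names and sequences are made up to stand in for the nine first search results.

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as multi-FASTA text with wrapped sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Made-up stand-ins for the first nine hits of a search.
hits = [(f"hit{n}", "ACGT" * n) for n in range(1, 10)]

# Keep hits number 4, 5, 6 and 9, as selected in the method above.
picked = [hits[n - 1] for n in (4, 5, 6, 9)]
text = to_fasta(picked)
print(text)
```

Writing `text` to a file named ldh1.fasta would give the same kind of file that was saved from Notepad.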
Experiment No 05:
Performing DNA and protein BLAST and analyzing result
i. Blastn
Methods:
 Open blast home page.
 Select nucleotide blast
 Copy a nucleotide sequence from previously saved ldh1.fasta file.
 Paste the sequence into Enter accession number(s), gi(s), or FASTA sequences(s) box.
 From Choose Search Set options select nucleotide collection (nr/nt).
 From Algorithm parameters select: Max target sequences (50)→ Expect threshold (0.1)
 Then click show results in new window and click BLAST.
Result:
ii. Blastp
Methods:
 Open blast home page.
 Select nucleotide blast.
 Select blastp.
 Copy a protein sequence from previously saved ldh1P.fasta file.
 Paste the sequence into Enter accession number(s), gi(s), or FASTA sequences(s) box.
 From Choose Search Set options select non-redundant protein sequences (nr).
 From Algorithm parameters select: Max target sequences (50)→ Expect threshold (0.1)
 Then click show results in new window and click BLAST.
Results:
iii. blastx
Methods:
 Open blast home page.
 Select nucleotide blast.
 Select blastx.
 Copy a nucleotide sequence from previously saved ldh1.fasta file.
 Paste the sequence into Enter accession number(s), gi(s), or FASTA sequences(s) box.
 From Choose Search Set options select non-redundant protein sequences (nr).
 From Algorithm parameters select: Max target sequences (50) → Expect threshold (0.1).
 Then click show results in new window and click BLAST.
Results:
iv. tblastn
Methods:
 Open blast home page.
 Select “nucleotide blast”
 Select “tblastn”.
 Copy a protein sequence from previously saved “ldh1P.fasta” file.
 Paste the sequence into “Enter accession number(s), gi(s), or FASTA sequences(s)” box.
 From “Choose Search Set” options select “nucleotide collection (nr/nt)”.
 From “Algorithm parameters” select: Max target sequences (50)→ Expect threshold (0.1)
 Then click “show results in new window” and click “BLAST”
Results:
v. tblastx
Methods:
 Open blast home page.
 Select “nucleotide blast”
 Select “tblastx”.
 Copy a nucleotide sequence from previously saved “ldh1.fasta” file.
 Paste the sequence into “Enter accession number(s), gi(s), or FASTA sequences(s)” box.
 From “Choose Search Set” options select “nucleotide collection (nr/nt)”.
 From “Algorithm parameters” select: Max target sequences (50)→ Expect threshold (0.1)
 Then click “show results in new window” and click “BLAST”
Interpretation:
The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The
program compares nucleotide or protein sequences to sequence databases and calculates the statistical
significance of matches. The five programs differ in the type of query and database:
blastn: search a nucleotide database using a nucleotide query
blastp: search a protein database using a protein query
blastx: search a protein database using a translated nucleotide query
tblastn: search a translated nucleotide database using a protein query
tblastx: search a translated nucleotide database using a translated nucleotide query
In the graphic summary of the result, each hit is drawn as a bar along the query, so the bar length
shows how much of the query is covered, and the bar color shows the alignment score (red marks the
highest-scoring hits). From the BLAST result we can download FASTA files.
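The choice of program follows from the type of the query and the type of the database. The small lookup below restates the list above; choose_blast and its arguments are hypothetical names used only for this illustration.

```python
# (query type, database type) -> program that compares them without translating both.
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",    # query is translated
    ("protein", "nucleotide"): "tblastn",   # database is translated
}

def choose_blast(query_type, database_type, translate_both=False):
    """Return the BLAST program name for a query/database combination."""
    if translate_both:
        # tblastx translates both the nucleotide query and the nucleotide database.
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and a nucleotide database")
        return "tblastx"
    return BLAST_PROGRAMS[(query_type, database_type)]

print(choose_blast("nucleotide", "protein"))
```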
Experiment No 06:
Pairwise alignment (global, end gap free), calculate identities, dotplot using BioEdit
i. Pairwise alignment- global
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select two sequences.
 Click on sequence → pairwise alignment → align two sequences (optimal GLOBAL alignment)
 Optionally click “Shade identities and similarities in alignment window” for color shading, or
“normal view” mode for the normal view.
Results:
Interpretation:
Global alignments, which attempt to align every residue in every sequence, are most useful when the
sequences in the query set are similar and of roughly equal size. In BioEdit we aligned two sequences
globally. With color shading we could see the identities and similarities in the alignment window.
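The idea of a global alignment can be sketched with the Needleman-Wunsch dynamic programming method. The block below only computes the optimal score, with simple made-up scores (match +1, mismatch -1, gap -2); BioEdit uses its own scoring matrices and gap penalties, so its numbers will differ.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch optimal global alignment score with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # In a global alignment, leading gaps are penalized like any other gap.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]

print(global_alignment_score("ACGT", "ACT"))
```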
ii. Pairwise alignment- end gap free
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select two sequences.
 Click on sequence → pairwise alignment → align two sequences (allow ends to slide)
 Optionally click “Shade identities and similarities in alignment window” for color shading, or
“normal view” mode for the normal view.
Results:
Interpretation:
In end gap free alignment, gaps that appear before the first or after the last letter of a sequence are
free, that is, they are not penalized. This is especially preferable when one of the sequences is
significantly shorter than the other. We aligned two sequences with free end gaps. With color shading
we observed the identities and similarities in the alignment window.
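Free end gaps can be obtained from the same dynamic programming table as a global alignment by changing two things: the first row and column start at zero, and the best score is read from anywhere in the last row or last column. The sketch below uses made-up scores (match +1, mismatch -1, gap -2) and is not BioEdit's implementation.

```python
def end_gap_free_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal alignment score when gaps at either end of either sequence are free."""
    rows, cols = len(a) + 1, len(b) + 1
    # Row 0 and column 0 stay at 0: leading gaps cost nothing.
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
    # Trailing gaps cost nothing: take the best cell of the last row or last column.
    return max(max(score[-1]), max(row[-1] for row in score))

# A short sequence contained in a longer one scores as a full match.
print(end_gap_free_score("GGGACGTGGG", "ACGT"))
```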
iii. Calculate identities
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select two sequences.
 Click on sequence → pairwise alignment → calculate identity/similarity for two sequences.
Interpretation:
Sequence identity is the number of characters that match exactly between two different sequences.
Gaps are not counted, and the measurement is relative to the length of the shorter of the two
sequences. In this experiment, we calculated the identity between two sequences with BioEdit.
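The definition above can be written out for two sequences that are already aligned, with - marking a gap. This is a sketch of that definition only; percent_identity is a hypothetical helper and BioEdit's reported value may be computed differently.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of exactly matching positions, relative to the shorter ungapped sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Positions where either sequence has a gap never count as a match.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    return 100.0 * matches / shorter

print(percent_identity("ACGT-A", "ACCTGA"))
```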
iv. Dotplot
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select two sequences.
 Click on sequence → pairwise alignment → Dot plot (Pairwise comparison)
Result:
Window – 20, mismatch limit - 10
Interpretation:
Dot plot is a graphical method that allows the comparison of two sequences and identifies regions of
close similarity between them. The convenience of dot-plot analysis is that one graphic shows all
significant pairwise alignments simultaneously. We constructed a dot plot window using BioEdit. We can
see a lot of dots along a diagonal line, which indicates that the two sequences contain many identical
residues at the same (or very similar) positions along their lengths. This is what we would expect,
because the two sequences are homologues (related sequences).
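The window and mismatch limit settings can be understood from a small sketch: a dot is drawn at position (i, j) when the window starting at i in the first sequence and the window starting at j in the second differ at no more than the allowed number of positions. The example uses a short made-up sequence and a small window; it is an illustration and not BioEdit's code.

```python
def dot_plot(a, b, window=20, mismatch_limit=10):
    """Return the set of (i, j) window start positions that are similar enough."""
    dots = set()
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            mismatches = sum(x != y for x, y in zip(a[i:i + window], b[j:j + window]))
            if mismatches <= mismatch_limit:
                dots.add((i, j))
    return dots

seq = "ACGTACGT"
dots = dot_plot(seq, seq, window=4, mismatch_limit=0)

# Text rendering: rows follow the first sequence, columns the second.
for i in range(len(seq) - 4 + 1):
    print("".join("*" if (i, j) in dots else "." for j in range(len(seq) - 4 + 1)))
```

The main diagonal appears because the sequence is compared with itself, and the two extra dots off the diagonal come from the repeated ACGT.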
Experiment No 07:
Nucleotide composition, complement, reverse complement, DNA to RNA,
translate, restriction map, six frame translation using BioEdit.
i. Nucleotide Composition
Methods:
 Open BioEdit
 Click file → open → Select All Files (*.*) from files of type
 Open ldh1.fasta file
 Select one sequence
 Click on sequence → nucleic acid → nucleotide composition
Result:
Interpretation:
Nucleotide composition summaries and plots may be obtained by choosing “Nucleotide Composition”
from the “Nucleic Acid” submenu of the “Sequence” menu. Bar plots show the molar percent of each
residue in the sequence. For nucleic acids, degenerate nucleotide designations are added to the plot
if and as they are encountered. For example, a sequence that has only A, G, C and T will have four bars
on the graph. The summary also reports the molecular weight, A+T content and G+C content.
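The molar percent of each base and the G+C and A+T content are simple counts. A minimal sketch on a made-up sequence is given below; nucleotide_composition is a hypothetical helper and molecular weight is left out.

```python
from collections import Counter

def nucleotide_composition(seq):
    """Return (counts, molar percent per base, G+C percent, A+T percent)."""
    seq = seq.upper()
    counts = Counter(seq)
    percent = {base: 100.0 * n / len(seq) for base, n in counts.items()}
    gc = percent.get("G", 0.0) + percent.get("C", 0.0)
    at = percent.get("A", 0.0) + percent.get("T", 0.0)
    return counts, percent, gc, at

counts, percent, gc, at = nucleotide_composition("AACGTTGC")
print(dict(counts), gc, at)
```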
ii. Complement
Methods:
 Open BioEdit.
 Click file → open.
 Select All Files (*.*) from files of type.
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → complement.
 To undo, select the sequence → click on sequence → nucleic acid → complement.
Result:
Interpretation:
We can get the complement sequence of the given sequence. If we want to get back the previous
sequence, we have to complement the sequence again.
iii. Reverse Complement
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file
 Select one sequence.
 Click on sequence → nucleic acid → reverse complement
 To undo, select the sequence → click on sequence → nucleic acid → reverse complement
Results:
Interpretation:
We can get the reverse complement sequence of the given sequence. If we want to get back the previous
sequence, we have to reverse complement the sequence again.
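Both operations, and the fact that applying either one twice restores the original sequence, can be checked with a short sketch. The function names are chosen for this illustration and only the four unambiguous bases are handled.

```python
# Translation table that swaps each base with its pairing partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq):
    """Replace every base by its complementary base, keeping the order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Complement the sequence and read it in the opposite direction."""
    return complement(seq)[::-1]

dna = "ATGCCG"
print(complement(dna), reverse_complement(dna))
```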
iv. DNA to RNA
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → DNA -> RNA
 To undo, select the sequence → click on sequence → nucleic acid → RNA -> DNA
Result:
Interpretation:
The DNA sequence is converted into an RNA sequence. An RNA sequence contains no thymine (T); uracil
(U) takes its place.
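The conversion is a letter substitution in both directions, as the short sketch below shows (the function names are chosen for this illustration).

```python
def dna_to_rna(seq):
    """Write uracil wherever the DNA sequence has thymine."""
    return seq.replace("T", "U").replace("t", "u")

def rna_to_dna(seq):
    """Write thymine wherever the RNA sequence has uracil."""
    return seq.replace("U", "T").replace("u", "t")

print(dna_to_rna("ATGTTC"))
```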
v. Translate
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → translate → frame 1, frame 2, frame 3
 To get the remaining 3 frames select the sequence →nucleic acid → reverse complement →
again select sequence → nucleic acid → translate → frame 1, frame 2, frame 3
Result:
Interpretation:
A double-stranded DNA sequence has six reading frames: three forward frames and three reverse frames.
Selecting a sequence and translating it in frame 1, frame 2 and frame 3 gives the three forward frames,
and the output shows which amino acid each group of three nucleotides codes for. Translating the
reverse complement of the sequence in the same three frames gives the remaining three reverse frames.
In the experiment we obtained 3 forward and 3 reverse frames of a selected sequence.
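The six frames can be reproduced with the standard genetic code. The sketch below builds the codon table from the usual compact listing (bases in the order T, C, A, G), translates the three forward frames, and then repeats this on the reverse complement. Incomplete codons at the end are dropped and unknown codons are written as X; these choices and the function names are assumptions of this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon from TTT to GGG, * marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq, frame=1):
    """Translate starting at position `frame` (1, 2 or 3), codon by codon."""
    seq = seq.upper()[frame - 1:]
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))

def six_frames(seq):
    """Return the three forward (+) and three reverse (-) frame translations."""
    reverse = reverse_complement(seq)
    frames = {f"+{f}": translate(seq, f) for f in (1, 2, 3)}
    frames.update({f"-{f}": translate(reverse, f) for f in (1, 2, 3)})
    return frames

for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```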
vi. Restriction Map
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → restriction map → cancel enzyme with degenerate
recognition and large recognition sites → select all enzymes from manufacturer → select circular
DNA (ends joint) → generate map
Results:
Interpretation:
A restriction map is a map of known restriction sites within a sequence of DNA. Restriction mapping
requires the use of restriction enzymes. Restriction Map accepts a DNA sequence and returns a textual
map showing the positions of restriction endonuclease cut sites. The map obtained in this experiment
shows the different restriction sites in the sequence and the restriction enzyme that cuts at each of
them.
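Finding restriction sites is a search for each enzyme's recognition sequence. The sketch below uses three well-known enzymes and reports the 1-based position where each recognition sequence starts (not the exact cut position inside it). For circular DNA the start of the sequence is appended to its end so that sites spanning the junction are found. The test sequences are made up and find_sites is a hypothetical helper.

```python
# Recognition sequences of three common restriction enzymes.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def find_sites(seq, circular=False):
    """Return {enzyme: [1-based start positions of its recognition sequence]}."""
    seq = seq.upper()
    hits = {}
    for name, site in ENZYMES.items():
        # For circular DNA, let a site run over the point where the ends are joined.
        searchable = seq + seq[:len(site) - 1] if circular else seq
        positions = []
        start = searchable.find(site)
        while start != -1:
            positions.append(start + 1)
            start = searchable.find(site, start + 1)
        hits[name] = positions
    return hits

print(find_sites("AAGAATTCGGGATCCTT"))
print(find_sites("TTCAAAAGAA", circular=True))
```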
vii. Six frame translation
Sorted six frame translation:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → sorted six frame translation → minimum ORF size 40 →
start codon ATG → translate
Result:
Unsorted six frame translation:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file.
 Select one sequence.
 Click on sequence → nucleic acid → unsorted six frame translation → minimum ORF size 40 →
start codon ATG → translate
Interpretation:
A DNA sequence may be translated in all six reading frames into all possible open reading frames (simple
codon stretches, actually) by highlighting the sequence title in the document window and choosing either
“Sorted Six-Frame Translation” or “Unsorted Six-Frame Translation”.
Sorted: ORFs will be reported in order of start position. Negative-frame sequences are sorted according
to their end positions (first position along the positive sequence). The number of sequences which can be
translated and sorted is limited to something above 10,500 sequences. If a sorted translation becomes too
large, resources for storing the sequences to be sorted run out. If this happens, BioEdit will tell you, then
present the sequences it was able to translate. Multiple sequences may be translated into a single ORF list
suitable for BLAST database creation.
Unsorted: Sequences are reported in the order that their stop codons are encountered in a once through,
6-frame simultaneous pass through the entire sequence. The codon stretches are written into a file as they
are encountered and therefore do not need to be stored in memory. Very long lists can thus be generated.
Currently, only one sequence at a time may be translated this way.
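A simplified version of a six-frame search with a start codon and a minimum size can be sketched as follows. Here an ORF is taken to run from an ATG to the next in-frame stop codon, the minimum size is counted in codons, positions are 1-based on the strand being read, and stretches that reach the end of the sequence without a stop codon are not reported. These are assumptions of the sketch; BioEdit's definitions may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, min_codons=40):
    """Return (strand, frame, start, end) for each ATG...stop stretch in six frames."""
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame + 1, start + 1, i + 3))
                    start = None                   # look for the next ATG
    # Sorted output: order the ORFs by start position.
    return sorted(orfs, key=lambda orf: orf[2])

print(find_orfs("CCATGGCCGCCGCCTAAG", min_codons=3))
```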
Experiment No 08:
Multiple sequence analysis using BioEdit
Multiple nucleotide sequence analysis:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1.fasta file
 Select all sequences.
 Click on accessory application → ClustalW Multiple alignment → Run ClustalW → Shade
identities and similarities in alignment window
Result:
Multiple protein sequence analysis:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1P.fasta file.
 Select all sequences.
 Click on accessory application → ClustalW Multiple alignment → Run ClustalW → Shade
identities and similarities in alignment window
Result:
Interpretation:
Multiple Sequence Alignment (MSA) is generally the alignment of three or more biological sequences
(protein or nucleic acid) of similar length. From the output, homology can be inferred and the
evolutionary relationships between the sequences studied. ClustalW is a multiple sequence alignment tool
for the alignment of DNA or protein sequences. ClustalW calculates the best match for the input
sequences based on the parameters entered and generates an easy to interpret report. In this
experiment we observed the alignment of more than two sequences for both nucleotide and protein
sequences. With the color shading we can find similarities and identities among the sequences.
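What the shading shows can be approximated by looking at each column of a finished alignment. The sketch below takes already aligned sequences (made up here, with - for gaps), finds the most common character in every column and the fraction of sequences that share it. It does not perform the alignment itself, which is ClustalW's task.

```python
from collections import Counter

def column_conservation(alignment):
    """Return one (most common character, fraction of sequences) pair per column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, count / len(alignment)))
    return result

alignment = ["ACG-T", "ACGAT", "ATGAT"]
columns = column_conservation(alignment)
consensus = "".join(residue for residue, _ in columns)
identical = [i for i, (_, fraction) in enumerate(columns) if fraction == 1.0]
print(consensus, identical)
```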
Experiment No 09:
Tree Generation with MEGA.
i. Construct/test maximum likelihood Tree
Methods:
 Open BioEdit
 Click file → open
 Select All Files (*.*) from files of type
 Open ldh1P.fasta file.
 Select all sequences.
 Click on accessory application → ClustalW Multiple alignment → Run ClustalW
 Click on file → save as → select location → File name (ldh-P-aln.fasta) → save as type : Fasta
(*.fas, *.fst, *.fsa) → save
 Open MEGA 6
 Click file → open → ldh-P-aln.fasta → analyze → select protein sequences → Ok
 Click phylogeny →construct/test maximum likelihood Tree → Yes
 Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type
(amino acid) → Model/ method (dayhoff model) → rates among sites (Gamma distributed with
Invariant sites- G+I) → Compute
Result:
ii. Construct/ test neighbor joining tree
Methods:
 Click phylogeny →construct/ test neighbor joining tree → Yes
 Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type
(amino acid) → Model/ method (dayhoff model) → rates among sites (Gamma distributed- G) →
Compute
Result:
iii. Construct/ test minimum- evolution tree
Methods:
 Click phylogeny →construct/ test minimum- evolution tree → Yes
 Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type
(amino acid) → Model/ method (dayhoff model) → rates among sites (Gamma distributed- G) →
Compute
Result:
iv. Construct/ test UPGMA tree
Methods:
 Click phylogeny →construct/ test UPGMA tree → Yes
 Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type
(amino acid) → Model/ method (dayhoff model) → rates among sites (Gamma distributed- G) →
Compute
Result:
v. Construct/ test maximum parsimony tree
Methods:
 Click phylogeny →construct/ test maximum parsimony tree → Yes
 Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type
(amino acid) → Compute
Result:
Interpretation:
A phylogeny, or evolutionary tree, represents the evolutionary relationships among a set of organisms or
groups of organisms, called taxa (singular: taxon). The tips of the tree represent groups of descendent taxa
(often species) and the nodes on the tree represent the common ancestors of those descendants. Two
descendants that split from the same node are called sister groups. With molecular evolutionary genetics
analysis (MEGA) we constructed different types of phylogenetic trees by inputting protein sequences.
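Of the methods used above, UPGMA is the simplest to sketch: the two closest groups are joined into a node, the distances from the new node to all other groups are averaged, and this is repeated until one tree remains. The block below works on a made-up distance table for four taxa and returns only the branching order as nested pairs. It has no branch lengths or bootstrap test and is not MEGA's implementation.

```python
def upgma(distances):
    """distances: {(a, b): d} for every pair of taxa. Returns the tree as nested tuples."""
    d = {frozenset(pair): value for pair, value in distances.items()}
    size = {taxon: 1 for pair in d for taxon in pair}   # cluster -> number of taxa in it
    while len(size) > 1:
        closest = min(d, key=d.get)                     # pair of clusters to join
        a, b = sorted(closest, key=str)
        merged = (a, b)
        for other in size:
            if other in closest:
                continue
            # Average of the two old distances, weighted by cluster size.
            d[frozenset((merged, other))] = (
                size[a] * d[frozenset((a, other))] + size[b] * d[frozenset((b, other))]
            ) / (size[a] + size[b])
        d = {pair: v for pair, v in d.items() if a not in pair and b not in pair}
        size[merged] = size.pop(a) + size.pop(b)
    return next(iter(size))

# Made-up distances: A and B are closest, C joins next, D is the most distant.
example = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree = upgma(example)
print(tree)
```

In the printed tree, A and B split from the same node, so they are sister groups in the sense described above.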
Experiment No 10:
Working with single protein sequence: Analyzing protein composition
(pepdigest, pepstats), Protein secondary structure by mEmboss (garnier for
protein secondary structure, helixturnhelix for motifs, pepcoil for coiled coil
regions)
i. Analyzing protein composition (pepstats)
Methods:
 Open mEMBOSS
 Click protein → composition → pepstats calculate statistics of protein properties
 In input section click on paste → cut and paste protein sequence → Go
Result:
Interpretation:
pepstats reads one or more protein sequences and writes an output file with various statistics on
the protein properties. This includes: molecular weight; number of residues; average residue
weight; charge; isoelectric point; for each type of amino acid: number, molar percent,
DayhoffStat; for each physico-chemical class of amino acid: number, molar percent; probability
of protein expression in E. coli inclusion bodies; molar extinction coefficient (A280); extinction
coefficient at 1 mg/ml (A280). In this experiment we input a protein sequence and obtained these
data.
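The per-residue part of this report (number and molar percent of each amino acid) can be sketched as a simple count. The example peptide is made up, amino_acid_composition is a hypothetical helper, and the other statistics that pepstats reports are not reproduced here.

```python
from collections import Counter

def amino_acid_composition(protein):
    """Return {amino acid: (number, molar percent)} in alphabetical order."""
    protein = protein.upper()
    counts = Counter(protein)
    return {aa: (n, 100.0 * n / len(protein)) for aa, n in sorted(counts.items())}

composition = amino_acid_composition("MKKA")
for aa, (number, percent) in composition.items():
    print(aa, number, percent)
```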
ii. Analyzing protein composition (pepdigest)- trypsin
Methods:
 Open mEMBOSS
 Click protein → composition → pepdigest report on protein proteolytic enzyme or reagent
cleavage sites
 In input section click on paste → cut and paste protein sequence
 In required section select trypsin → Go
Result:
Analyzing protein composition (pepdigest)- chymotrypsin
Methods:
 Open mEMBOSS
 Click protein → composition → pepdigest report on protein proteolytic enzyme or reagent
cleavage sites
 In input section click on paste → cut and paste protein sequence
 In required section select chymotrypsin → Go
Result:
Interpretation:
This program takes one or more protein sequences as input and lets the user specify one proteolytic
agent from a list, which might be a proteolytic enzyme or other reagent. It then writes a report
file containing the positions where the agent cuts, together with the peptides produced. The rest
of the file consists of columns holding the following data: start position of the fragment, end
position of the fragment, molecular weight of the fragment, residue before the cut site ('.' if start
of sequence), residue after the second cut site ('.' if end of sequence), sequence of the fragment.
In this experiment we input a protein sequence, selected the proteolytic enzymes trypsin and
chymotrypsin in turn, and obtained these data as the result.
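A digest can be sketched with the commonly quoted simplified rule for trypsin: cut after lysine (K) or arginine (R), unless the next residue is proline (P). The exact rules applied by pepdigest may differ, the peptide below is made up, and the fragment molecular weights are left out.

```python
def trypsin_digest(protein):
    """Return (start, end, fragment) triples with 1-based positions."""
    fragments = []
    start = 0
    for i, residue in enumerate(protein):
        is_last = i + 1 == len(protein)
        # Cut after K or R, but not before P and not after the final residue.
        if residue in "KR" and not is_last and protein[i + 1] != "P":
            fragments.append((start + 1, i + 1, protein[start:i + 1]))
            start = i + 1
    fragments.append((start + 1, len(protein), protein[start:]))
    return fragments

for fragment in trypsin_digest("MKRPAKG"):
    print(fragment)
```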
iii. Protein secondary structure by mEmboss: (garnier for protein secondary structure)
Methods:
 Open mEMBOSS
 Click protein → 2D STRUCTURE → garnier predict protein secondary structure using GOR
method
 In input section click on paste → cut and paste protein sequence → Go
Result:
Interpretation:
Garnier is an implementation of the original Garnier Osguthorpe Robson algorithm (GOR I) for
predicting protein secondary structure. It reads an input protein sequence and writes a standard EMBOSS
report file with the predicted secondary structure. The Garnier method is not regarded as the most
accurate prediction, but is simple to calculate on most workstations. In this experiment we input a
protein sequence and obtained its predicted secondary structure.
iv. Protein secondary structure by mEmboss: helixturnhelix for motifs
Methods:
 Open mEMBOSS
 Click protein → 2D STRUCTURE → helixturnhelix identify nucleic acid-binding motifs in
protein sequences
 In input section click on paste → cut and paste any protein sequence → Go
Result:
Interpretation:
helixturnhelix uses the method of Dodd and Egan to identify helix-turn-helix nucleic acid binding motifs
in an input protein sequence. The output is a standard EMBOSS report file describing the location, size
and score of any putative motifs. For the sequence we input, the output reports any nucleic
acid-binding motifs identified in the protein sequence.
Experiment No 11:
RNA structure prediction using RNAstructure
Methods:
 Open RNAstructure
 Click file→ new sequence
 Title- ldh1RNA → sequence (copy and paste 2 lines of sequences from ldh1.fasta) → fold as
RNA → yes → select location → file name - ldh1RNA → save → start → draw structures
 Draw → go to structure number/ zoom
Interpretation:
RNAstructure is a software package for RNA secondary structure prediction and analysis. It predicts
lowest free energy structures and low free energy structures either by using a heuristic or by determining
all possible low free energy structures. From this process we obtain RNA secondary structures with
different energy levels. The structure with the lowest free energy is the most stable. To perform this
process we should set suitable values for the maximum and minimum energy difference, the maximum
number of structures, the window size etc.
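The idea of searching over possible secondary structures can be illustrated with a much simpler method than the one RNAstructure uses. The sketch below is the Nussinov approach, which maximizes the number of base pairs (A-U, G-C and G-U) by dynamic programming instead of minimizing free energy, and requires at least three unpaired bases inside every hairpin loop. It returns only the maximum number of pairs for a made-up sequence and is not a substitute for the program's energy model.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(rna, min_loop=3):
    """Nussinov dynamic programming: largest number of nested base pairs."""
    n = len(rna)
    if n == 0:
        return 0
    best = [[0] * n for _ in range(n)]      # best[i][j]: pairs within rna[i..j]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]          # case 1: base j stays unpaired
            for k in range(i, j - min_loop):
                if (rna[k], rna[j]) in PAIRS:
                    # case 2: base j pairs with base k, splitting the problem in two.
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAAUCC"))
```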

More Related Content

What's hot

Chou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure predictionChou fasman algorithm for protein structure prediction
Chou fasman algorithm for protein structure predictionRoshan Karunarathna
 
Ab Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionAb Initio Protein Structure Prediction
Ab Initio Protein Structure PredictionArindam Ghosh
 
Tools for modeling protein 3 d structure
Tools for modeling protein 3 d structureTools for modeling protein 3 d structure
Tools for modeling protein 3 d structurePriyanka Kashyap
 
The uni prot knowledgebase
The uni prot knowledgebaseThe uni prot knowledgebase
The uni prot knowledgebaseKew Sama
 
Protien Structure Prediction
Protien Structure PredictionProtien Structure Prediction
Protien Structure PredictionSelimReza76
 
threading and homology modelling methods
threading and homology modelling methodsthreading and homology modelling methods
threading and homology modelling methodsmohammed muzammil
 
Dynamic programming and pairwise sequence alignment
Dynamic programming and pairwise sequence alignmentDynamic programming and pairwise sequence alignment
Dynamic programming and pairwise sequence alignmentGeethanjaliAnilkumar2
 
Structural Bioinformatics - Homology modeling & its Scope
Structural Bioinformatics - Homology modeling & its ScopeStructural Bioinformatics - Homology modeling & its Scope
Structural Bioinformatics - Homology modeling & its ScopeNixon Mendez
 
Needleman-Wunsch Algorithm
Needleman-Wunsch AlgorithmNeedleman-Wunsch Algorithm
Needleman-Wunsch AlgorithmProshantaShil
 
Primary and secondary database
Primary and secondary databasePrimary and secondary database
Primary and secondary databaseKAUSHAL SAHU
 
levels of protein structure , Domains ,motifs & Folds in protein structure
levels of protein structure , Domains ,motifs & Folds in protein structurelevels of protein structure , Domains ,motifs & Folds in protein structure
levels of protein structure , Domains ,motifs & Folds in protein structureAaqib Naseer
 

Bioinformatics laboratory

  • 3. Experiment No 01: Searching basics, AND, OR, NOT, “keywords together”, *
i. Searching Basics
Methods:
 Open the PubMed home page.
 In the PubMed search box, type alpha amylase and click Search. 10639 results are shown.
 To filter the results, click Free full text in the Text availability section. The results are reduced to 3108.
 Then click 5 years in the Publication dates section. The results are reduced to 792.
 Click Review in the Article types section. The results are reduced to 16.
 To clear a filter, click Clear to the right of that filter type, or click Clear all.
Result:
Interpretation: PubMed is a free search engine that primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics. PubMed is maintained and updated by the National Library of Medicine on a weekly basis. A search on alpha amylase returns all articles related to alpha amylase. To filter the results, we can click Review, which shows only review articles. Free full text narrows the results to articles whose full text is freely available, and 5 years narrows them to articles published in the previous 5 years.
  • 4. ii. Using Boolean Operators
a. AND
Methods:
 Open the PubMed home page.
 Type gyrase AND topoisomerase and search. 1986 results are found.
 Then filter: click Free full text, Review and 5 years. The results are reduced to 1032, 31 and 6, respectively.
Result:
Interpretation: AND requires both terms to be present in each item returned. If one term is contained in the document and the other is not, the item is not included in the resulting list (this narrows the search). A search on gyrase AND topoisomerase therefore returns only articles that contain both keywords.
b. OR
Methods:
 Open the PubMed home page.
 Search antibody OR immunoglobulin.
 1247231 results are found.
 Now filter: Free full text – 363108, Review – 17602, 5 years – 7068.
  • 5. Result:
Interpretation: Either term (or both) will be in each returned document (this broadens the search). A search on antibody OR immunoglobulin returns articles containing the word antibody (but not immunoglobulin), articles containing the word immunoglobulin (but not antibody), and articles containing both words, in any order and any number of times.
c. NOT
Methods:
 Open the PubMed home page.
 Search immunoglobulin NOT IgG NOT IgA NOT IgM.
 693211 results are found.
 Now filter:
 Free full text – 181743
 Review – 10047
 5 years – 3618
  • 6. Result:
Interpretation: The first term is searched, and any records containing the terms that follow the NOT operators are then subtracted from the results. A search on immunoglobulin NOT IgG NOT IgA NOT IgM therefore returns articles about immunoglobulin that do not mention IgG, IgA or IgM.
iii. Inverted comma (“ ”) search
Methods:
 Open the PubMed home page.
 Search “alpha amylase”.
 8015 results are found.
 Now filter:
 Free full text – 2393
 Review – 25
 5 years – 12
  • 7. Result:
Interpretation: When a term is enclosed in inverted commas, only records containing that exact phrase are shown. A search on “alpha amylase” therefore returns articles that contain the exact phrase, and the result is more specific.
iv. * search
Methods:
 Open the PubMed home page.
 Search ldh*.
 25235 results are found.
 Now filter:
 Free full text – 5691
 Review – 96
 5 years – 46
  • 8. Result:
Interpretation: When a term is searched with *, records containing any word that begins with that term are shown. A search on ldh* therefore returns articles containing ldh itself and every term that starts with ldh.
Experiment No 02: Searching PMC and PubMed Using Authors name, fields, limits
i. Searching PubMed using author name
Methods:
 Open the PubMed home page.
 Type Schilling CH (1999) in the search box and search.
 2 results are found.
  • 9. Result:
Interpretation: PubMed is a free search engine that primarily accesses the MEDLINE database of references and abstracts on life sciences and biomedical topics. PubMed is maintained and updated by the National Library of Medicine on a weekly basis. A search on Schilling CH (1999) returns the articles published by the author Schilling CH in 1999.
ii. Searching PMC using author name
Methods:
 Open the PubMed home page.
 Select PMC.
 Type Schilling CH in the search box and search.
 9 results are found.
  • 10. Result:
Interpretation: PubMed Central (PMC) is a free digital archive of articles, accessible to anyone from anywhere via a basic web browser. The full text of all PubMed Central articles is free to read, with varying provisions for reuse. A search on Schilling CH returns the free full-text articles by the author Schilling CH.
  • 11. Experiment No 03: Retrieving protein sequences using UniProt and creating multi-fasta files
Methods:
 Open the UniProt home page.
 Search for ldh1.
 Filter the results by clicking Reviewed (893) under the Filter by option.
 On the left side, type lactobacillus in the Other organisms box and click Go. 16 results are shown.
 Click the checkbox in the Entry column header to select all entries.
 Click Download → Uncompressed → Go.
 Select all sequences (Ctrl + A).
 Copy all sequences (Ctrl + C).
 Open Notepad.
 Paste all sequences (Ctrl + V).
 Save the sequences: click File → Save as (Ctrl + S) → select a location → type ldh1P.fasta as the file name → click Save.
Result:
Interpretation: UniProt is the Universal Protein resource, a central repository of protein data created by combining the Swiss-Prot, TrEMBL and PIR-PSD databases. We can search for any protein sequence in UniProt, create a multi-fasta file from the results, and save it with Notepad. The saved fasta file can be reused whenever it is needed.
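The copy-and-paste steps above produce a plain-text multi-fasta file: each record is a header line beginning with ">" followed by one or more sequence lines. As a minimal sketch of that layout (the function names, the record names and the short sequences below are hypothetical placeholders, not data from this experiment), such a file could be read and written like this:

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Split multi-FASTA text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def format_fasta(records: List[Tuple[str, str]], width: int = 60) -> str:
    """Write (header, sequence) pairs as multi-FASTA text, wrapping sequences."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + "\n"


# Placeholder records, invented only to show the file layout.
example = [("example_protein_1", "MKVAIIGAGNVG"), ("example_protein_2", "MSRKVLVIGTG")]
fasta_text = format_fasta(example)
```

Writing `fasta_text` to a file named ldh1P.fasta would give the same kind of file that the Notepad steps create by hand.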
  • 12. Experiment No 04: Retrieving relevant DNA sequences using nucleotide and creating multi fasta-file. (Search by ldh1 NOT hypothetical)
Methods:
 Open the PubMed home page.
 Select Nucleotide.
 Search for ldh1 NOT hypothetical. 51 results are found.
 Select results number 4, 5, 6 and 9.
 Click the arrow to the right of Summary and select FASTA (text).
 Select all sequences (Ctrl + A).
 Copy all sequences (Ctrl + C).
 Open Notepad.
 Paste all sequences (Ctrl + V).
 Save the sequences: click File → Save as (Ctrl + S) → select a location → type ldh1.fasta as the file name → click Save.
Result:
Interpretation: We can search for any nucleotide sequence in the Nucleotide database, which is reached from the PubMed home page. We can create a multi-fasta file from the results and save it with Notepad. The saved fasta file can be reused whenever it is needed.
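The effect of the NOT operator in the query ldh1 NOT hypothetical can be illustrated on a list of record titles. The sketch below is only an illustration: it uses simple case-insensitive substring matching, which is an assumption and not how the database indexes records, and the titles are invented placeholders.

```python
from typing import List


def keep_title(title: str, required: str, excluded: str) -> bool:
    """Return True when the title contains `required` but not `excluded`."""
    text = title.lower()
    return required.lower() in text and excluded.lower() not in text


def filter_titles(titles: List[str], required: str, excluded: str) -> List[str]:
    """Mimic `required NOT excluded` on a list of record titles."""
    return [t for t in titles if keep_title(t, required, excluded)]


# Invented titles, used only to show which records survive the filter.
titles = [
    "Example organism A ldh1 gene, complete cds",
    "Example organism B hypothetical protein (ldh1) mRNA",
    "Example organism C unrelated gene",
]
kept = filter_titles(titles, "ldh1", "hypothetical")
```

Only the first placeholder title is kept: the second mentions ldh1 but is excluded by NOT hypothetical, and the third never matched ldh1.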
  • 13. Experiment No 05: Performing DNA and protein BLAST and analyzing result
i. blastn
Methods:
 Open the BLAST home page.
 Select nucleotide blast.
 Copy a nucleotide sequence from the previously saved ldh1.fasta file.
 Paste the sequence into the Enter accession number(s), gi(s), or FASTA sequence(s) box.
 Under Choose Search Set, select Nucleotide collection (nr/nt).
 Under Algorithm parameters, set Max target sequences (50) → Expect threshold (0.1).
 Then tick Show results in a new window and click BLAST.
Results:
ii. blastp
Methods:
 Open the BLAST home page.
 Select nucleotide blast.
 Select blastp.
 Copy a protein sequence from the previously saved ldh1P.fasta file.
 Paste the sequence into the Enter accession number(s), gi(s), or FASTA sequence(s) box.
 Under Choose Search Set, select Non-redundant protein sequences (nr).
 Under Algorithm parameters, set Max target sequences (50) → Expect threshold (0.1).
 Then tick Show results in a new window and click BLAST.
  • 14. Results:
iii. blastx
Methods:
 Open the BLAST home page.
 Select nucleotide blast.
 Select blastx.
 Copy a nucleotide sequence from the previously saved ldh1.fasta file.
 Paste the sequence into the Enter accession number(s), gi(s), or FASTA sequence(s) box.
 Under Choose Search Set, select Non-redundant protein sequences (nr).
 Under Algorithm parameters, set Max target sequences (50) → Expect threshold (0.1).
 Then tick Show results in a new window and click BLAST.
Results:
  • 15. iv. tblastn
Methods:
 Open the BLAST home page.
 Select “nucleotide blast”.
 Select “tblastn”.
 Copy a protein sequence from the previously saved “ldh1P.fasta” file.
 Paste the sequence into the “Enter accession number(s), gi(s), or FASTA sequence(s)” box.
 Under “Choose Search Set”, select “Nucleotide collection (nr/nt)”.
 Under “Algorithm parameters”, set Max target sequences (50) → Expect threshold (0.1).
 Then tick “Show results in a new window” and click “BLAST”.
Results:
  • 16. v. tblastx
Methods:
 Open the BLAST home page.
 Select “nucleotide blast”.
 Select “tblastx”.
 Copy a nucleotide sequence from the previously saved “ldh1.fasta” file.
 Paste the sequence into the “Enter accession number(s), gi(s), or FASTA sequence(s)” box.
 Under “Choose Search Set”, select “Nucleotide collection (nr/nt)”.
 Under “Algorithm parameters”, set Max target sequences (50) → Expect threshold (0.1).
 Then tick “Show results in a new window” and click “BLAST”.
Interpretation: The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
 blastn: searches a nucleotide database using a nucleotide query.
 blastp: searches a protein database using a protein query.
 blastx: searches a protein database using a translated nucleotide query.
 tblastn: searches a translated nucleotide database using a protein query.
 tblastx: searches a translated nucleotide database using a translated nucleotide query.
In the graphic summary of a BLAST result, each bar shows the part of the query covered by a hit, and red bars mark the highest-scoring alignments. The matching sequences can be downloaded from the result page as fasta files.
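The choice among the five programs listed above depends only on the type of the query, the type of the database, and whether nucleotide sequences are compared as translated protein. A small lookup sketch makes that rule explicit (the function name and argument names are hypothetical, chosen for this illustration):

```python
def choose_blast_program(query: str, database: str, translated: bool = False) -> str:
    """Pick a BLAST program from the query type and the database type.

    `query` and `database` are "nucleotide" or "protein". `translated` only
    matters for a nucleotide query against a nucleotide database, where it
    selects tblastx (both sides translated) instead of blastn.
    """
    if query == "nucleotide" and database == "nucleotide":
        return "tblastx" if translated else "blastn"
    if query == "protein" and database == "protein":
        return "blastp"
    if query == "nucleotide" and database == "protein":
        return "blastx"   # the nucleotide query is translated
    if query == "protein" and database == "nucleotide":
        return "tblastn"  # the nucleotide database is translated
    raise ValueError("query and database must be 'nucleotide' or 'protein'")


program_for_ldh1 = choose_blast_program("nucleotide", "protein")
```

For a nucleotide sequence from ldh1.fasta searched against the non-redundant protein database, this selects blastx, matching part iii of the experiment.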
  • 17. Experiment No 06: Pairwise alignment (global, end gap free), calculate identities, dotplot using BioEdit
i. Pairwise alignment - global
Methods:
 Open BioEdit.
 Click File → Open.
 Select All Files (*.*) under Files of type.
 Open the ldh1.fasta file.
 Select two sequences.
 Click Sequence → Pairwise alignment → Align two sequences (optimal GLOBAL alignment).
 Click “Shade identities and similarities in alignment window” for color shading, or choose “normal view” mode for the normal view.
Results:
Interpretation: Global alignments, which attempt to align every residue in every sequence, are most useful when the sequences in the query set are similar and of roughly equal size. In BioEdit we aligned two sequences globally. With color shading we could see the identities and similarities in the alignment window.
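The idea of aligning every residue of both sequences can be sketched with the classic Needleman-Wunsch dynamic programming method. This is an illustration only: the scoring values (match +1, mismatch -1, gap -2) are assumptions, and BioEdit's own scoring scheme and implementation are not described in this report.

```python
from typing import Tuple


def global_align(a: str, b: str, match: int = 1, mismatch: int = -1,
                 gap: int = -2) -> Tuple[str, str, int]:
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Leading gaps are penalized in a global alignment.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = global_align("ACGT", "AGT")
# aligned_a is "ACGT", aligned_b is "A-GT", total is 1
```

With these assumed scores, three matches (+3) and one gap (-2) give a total of 1 for the toy pair.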
  • 18. ii. Pairwise alignment - end gap free
Methods:
 Open BioEdit.
 Click File → Open.
 Select All Files (*.*) under Files of type.
 Open the ldh1.fasta file.
 Select two sequences.
 Click Sequence → Pairwise alignment → Align two sequences (allow ends to slide).
 Click “Shade identities and similarities in alignment window” for color shading, or choose “normal view” mode for the normal view.
Results:
Interpretation: In an end gap free alignment, gaps that appear before the first or after the last letter of a sequence are free, that is, they are not penalized. This is especially preferable when one of the sequences is significantly shorter than the other. We aligned two sequences end gap free. With color shading we could see the identities and similarities in the alignment window.
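The difference from a global alignment can be sketched by changing only the borders of the dynamic programming table: the first row and column start at zero, and the best score is taken anywhere on the last row or last column, so gaps at either end cost nothing. The scoring values are assumptions for illustration, and this sketch returns only the score, not the alignment.

```python
def ends_free_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -2) -> int:
    """Best alignment score when leading and trailing gaps are not penalized."""
    n, m = len(a), len(b)
    # First row and first column stay at 0: leading gaps are free.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trailing gaps are free: take the best cell on the last row or column.
    last_row = max(score[n])
    last_col = max(row[m] for row in score)
    return max(last_row, last_col)


short_in_long = ends_free_score("ACGTACGT", "GTAC")
# short_in_long is 4: the short sequence matches inside the long one
```

Under the same assumed scores, a fully global alignment of this pair would have to pay for four gap positions, which is why the end gap free mode suits sequences of very different lengths.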
  • 19. iii. Calculate identities
Methods:
 Open BioEdit.
 Click File → Open.
 Select All Files (*.*) under Files of type.
 Open the ldh1.fasta file.
 Select two sequences.
 Click Sequence → Pairwise alignment → Calculate identity/similarity for two sequences.
Interpretation: Sequence identity is the number of characters that match exactly between two different sequences. Gaps are not counted, and the measurement is relative to the shorter of the two sequences. In this experiment, we found the identity between two sequences using BioEdit.
iv. Dotplot
Methods:
 Open BioEdit.
 Click File → Open.
 Select All Files (*.*) under Files of type.
 Open the ldh1.fasta file.
 Select two sequences.
 Click Sequence → Pairwise alignment → Dot plot (Pairwise comparison).
  • 20. Result: Window – 20, mismatch limit – 10
Interpretation: A dot plot is a graphical method that allows two sequences to be compared and regions of close similarity between them to be identified. The convenience of dot plot analysis is that a single graphic shows all significant pairwise alignments simultaneously. We constructed a dot plot using BioEdit. We can see many dots along a diagonal line, which indicates that the two protein sequences contain many identical amino acids at the same (or very similar) positions along their lengths. This is what we would expect, because we know that these two proteins are homologues (related proteins).
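The window and mismatch limit settings reported above can be read as a simple rule: place a dot at position (i, j) when the windows starting at i in the first sequence and at j in the second differ in no more than the allowed number of positions. The sketch below implements that reading; it is an assumption about how the two parameters interact, not BioEdit's documented algorithm.

```python
from typing import List, Tuple


def dot_plot(a: str, b: str, window: int = 20,
             mismatch_limit: int = 10) -> List[Tuple[int, int]]:
    """Return (i, j) start positions whose windows differ in at most
    `mismatch_limit` of `window` positions."""
    points: List[Tuple[int, int]] = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            mismatches = sum(1 for k in range(window) if a[i + k] != b[j + k])
            if mismatches <= mismatch_limit:
                points.append((i, j))
    return points


# Toy example with a small window so the result is easy to check by hand.
points = dot_plot("ACGTACGT", "ACGTACGT", window=4, mismatch_limit=0)
```

In the toy example the main diagonal (0,0) to (4,4) appears, plus the off-diagonal dots (0,4) and (4,0) caused by the repeated ACGT. A long run of dots on one diagonal is the pattern described in the interpretation above.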
  • 21. Experiment No 07: Nucleotide composition, complement, reverse complement, DNA to RNA, translate, restriction map, six frame translation using BioEdit
i. Nucleotide Composition
Methods:
 Open BioEdit.
 Click File → Open → select All Files (*.*) under Files of type.
 Open the ldh1.fasta file.
 Select one sequence.
 Click Sequence → Nucleic acid → Nucleotide composition.
Result:
Interpretation: Nucleotide composition summaries and plots are obtained by choosing “Nucleotide Composition” from the “Nucleic Acid” submenu of the “Sequence” menu. Bar plots show the molar percent of each residue in the sequence. For nucleic acids, degenerate nucleotide designations are added to the plot if and as they are encountered. For example, a sequence that has only A, G, C and T will have four bars on the graph. We can also observe the molecular weight, the A+T content and the G+C content.
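The counts behind such a composition summary are simple to reproduce. The sketch below computes the count and percentage of each symbol plus the G+C and A+T content; it does not compute molecular weight, and the function name and return layout are hypothetical choices for this illustration.

```python
from collections import Counter
from typing import Dict, Tuple


def nucleotide_composition(seq: str) -> Tuple[Dict[str, int], Dict[str, float], float, float]:
    """Return counts, percentages, G+C percent and A+T percent of a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    # Degenerate symbols such as N simply get their own entry, like an extra bar.
    percent = {base: 100.0 * count / total for base, count in counts.items()}
    gc = 100.0 * (counts["G"] + counts["C"]) / total
    at = 100.0 * (counts["A"] + counts["T"]) / total
    return dict(counts), percent, gc, at


counts, percent, gc, at = nucleotide_composition("AACCGGTT")
# each base is 25.0 percent; gc is 50.0 and at is 50.0
```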
  • 22. ii. Complement Methods:  Open BioEdit.  Click file → open.  Select All Files (*.*) from files of type.  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → complement.  To undo, select the sequence → click on sequence → nucleic acid → complement. Result: Interpretation: We can get the complement of the given sequence. To get back the original sequence, we have to complement the sequence again. iii. Reverse Complement Methods:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file  Select one sequence.  Click on sequence → nucleic acid → reverse complement  To undo, select the sequence → click on sequence → nucleic acid → reverse complement
  • 23. Results: Interpretation: We can get the reverse complement of the given sequence. To get back the original sequence, we have to reverse complement the sequence again. iv. DNA to RNA Methods:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → DNA -> RNA  To undo, select the sequence → click on sequence → nucleic acid → RNA -> DNA Result: Interpretation: The DNA sequence is converted into an RNA sequence. An RNA sequence contains no thymine (T); uracil (U) takes its place.
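The three operations above (complement, reverse complement, DNA to RNA) are simple string transformations, and the "undo" steps work because complementing or reverse complementing twice returns the original sequence. The sketch below illustrates this in Python; the function names are chosen for illustration and only the four unambiguous bases are handled.

```python
# Translation table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(dna):
    """Replace each base by its pairing partner, keeping the order."""
    return dna.translate(_COMPLEMENT)


def reverse_complement(dna):
    """Complement the sequence and read it in the opposite direction."""
    return complement(dna)[::-1]


def dna_to_rna(dna):
    """Replace thymine (T) with uracil (U)."""
    return dna.replace("T", "U").replace("t", "u")


def rna_to_dna(rna):
    """Replace uracil (U) with thymine (T)."""
    return rna.replace("U", "T").replace("u", "t")


if __name__ == "__main__":
    seq = "ATGC"
    print(complement(seq))          # TACG
    print(reverse_complement(seq))  # GCAT
    print(dna_to_rna(seq))          # AUGC
```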
  • 24. v. Translate Methods:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → translate → frame 1, frame 2, frame 3  To get the remaining 3 frames, select the sequence → nucleic acid → reverse complement → select the sequence again → nucleic acid → translate → frame 1, frame 2, frame 3 Result:
  • 25. Interpretation: A DNA sequence can be read in six frames: three forward and three reverse. Selecting a sequence and choosing frame 1, frame 2 and frame 3 gives the three forward frames and shows which amino acid each group of three nucleotides encodes. Reverse complementing the sequence and translating again gives the three reverse frames. In this experiment we obtained 3 forward and 3 reverse frames of the selected sequence. vi. Restriction Map Methods:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → restriction map → deselect enzymes with degenerate recognition and large recognition sites → select all enzymes from manufacturer → select circular DNA (ends joint) → generate map Results:
  • 26. Interpretation: A restriction map is a map of known restriction sites within a sequence of DNA. Restriction mapping requires the use of restriction enzymes. Restriction Map accepts a DNA sequence and returns a textual map showing the positions of restriction endonuclease cut sites. The map obtained in this experiment shows the different restriction sites and the restriction enzymes that cut at them. vii. Six frame translation Sorted six frame translation:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → sorted six frame translation → minimum ORF size 40 → start codon ATG → translate Result: Unsorted six frame translation:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file.  Select one sequence.  Click on sequence → nucleic acid → unsorted six frame translation → minimum ORF size 40 → start codon ATG → translate
  • 27. Interpretation: A DNA sequence may be translated in all six reading frames into all possible open reading frames (simple codon stretches, actually) by highlighting the sequence title in the document window and choosing either “Sorted Six-Frame Translation” or “Unsorted Six-Frame Translation”. Sorted: ORFs will be reported in order of start position. Negative-frame sequences are sorted according to their end positions (first position along the positive sequence). The number of sequences which can be translated and sorted is limited to something above 10,500 sequences. If a sorted translation becomes too large, resources for storing the sequences to be sorted run out. If this happens, BioEdit will tell you, then present the sequences it was able to translate. Multiple sequences may be translated into a single ORF list suitable for BLAST database creation. Unsorted: Sequences are reported in the order that their stop codons are encountered in a once-through, 6-frame simultaneous pass through the entire sequence. The codon stretches are written into a file as they are encountered and therefore do not need to be stored in memory. Very long lists can thus be generated. Currently, only one sequence at a time may be translated this way. Experiment No 08: Multiple sequence analysis using BioEdit Multiple nucleotide sequence analysis:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1.fasta file  Select all sequences.  Click on accessory application → ClustalW Multiple alignment → Run ClustalW → Shade identities and similarities in alignment window
  • 28. Result: Multiple protein sequence analysis:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1P.fasta file.  Select all sequences.  Click on accessory application → ClustalW Multiple alignment → Run ClustalW → Shade identities and similarities in alignment window Result: Interpretation: Multiple Sequence Alignment (MSA) is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred and the evolutionary relationships between the sequences studied. ClustalW is a multiple sequence alignment tool for the alignment of DNA or protein sequences. ClustalW calculates the best match for the input sequences based on the parameters entered and generates an easy to interpret report. In this experiment we observed the alignment of more than two sequences for both nucleotide and protein sequences. From the color shading we can find similarities and identities among the sequences.
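The idea behind shading identities can be sketched in Python: once the sequences are aligned (all rows the same length, gaps written as "-"), a column counts as an identity when every sequence carries the same residue there. The function below is an illustrative assumption, not BioEdit's or ClustalW's code; it marks such columns with "*", similar to the conservation line in Clustal-style output.

```python
def conservation_line(aligned):
    """Return a string with '*' under every alignment column in which all
    sequences have the same residue and no gap, and ' ' elsewhere."""
    if not aligned:
        raise ValueError("no sequences given")
    if len({len(row) for row in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        identical = "-" not in column and len(set(column)) == 1
        marks.append("*" if identical else " ")
    return "".join(marks)


if __name__ == "__main__":
    alignment = ["ATG-CA", "ATGTCA", "ACGTCA"]
    for row in alignment:
        print(row)
    print(conservation_line(alignment))
```

Similarity shading additionally groups residues with related properties, which requires a substitution matrix or residue classes and is not attempted here.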
  • 29. Experiment No 09: Tree Generation with MEGA. i. Construct/test maximum likelihood Tree Methods:  Open BioEdit  Click file → open  Select All Files (*.*) from files of type  Open ldh1P.fasta file.  Select all sequences.  Click on accessory application → ClustalW Multiple alignment → Run ClustalW  Click on file → save as → select location → File name (ldh-P-aln.fasta) → save as type : Fasta (*.fas, *.fst, *.fsa) → save  Open MEGA 6  Click file → open → ldh-P-aln.fasta → analyze → select protein sequences → Ok  Click phylogeny → construct/test maximum likelihood Tree → Yes  Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type (amino acid) → Model/ method (Dayhoff model) → rates among sites (Gamma distributed with Invariant sites, G+I) → Compute Result:
  • 30. ii. Construct/ test neighbor joining tree Methods:  Click phylogeny → construct/ test neighbor joining tree → Yes  Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type (amino acid) → Model/ method (Dayhoff model) → rates among sites (Gamma distributed, G) → Compute Result: iii. Construct/ test minimum-evolution tree Methods:  Click phylogeny → construct/ test minimum-evolution tree → Yes  Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type (amino acid) → Model/ method (Dayhoff model) → rates among sites (Gamma distributed, G) → Compute
  • 31. Result: iv. Construct/ test UPGMA tree Methods:  Click phylogeny → construct/ test UPGMA tree → Yes  Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type (amino acid) → Model/ method (Dayhoff model) → rates among sites (Gamma distributed, G) → Compute Result:
  • 32. v. Construct/ test maximum parsimony tree Methods:  Click phylogeny → construct/ test maximum parsimony tree → Yes  Test of Phylogeny (Bootstrap method) → No. of Bootstrap Replications (50) → substitution type (amino acid) → Compute Result: Interpretation: A phylogeny, or evolutionary tree, represents the evolutionary relationships among a set of organisms or groups of organisms, called taxa (singular: taxon). The tips of the tree represent groups of descendant taxa (often species) and the nodes on the tree represent the common ancestors of those descendants. Two descendants that split from the same node are called sister groups. With Molecular Evolutionary Genetics Analysis (MEGA) we constructed different types of phylogenetic tree from the input protein sequences. Experiment No 10: Working with single protein sequence: Analyzing protein composition (pepdigest, pepstats), Protein secondary structure by mEmboss: (garnier for protein secondary structure, helixturnhelix for motifs, pepcoil for coiled coil regions). i. Analyzing protein composition (pepstats) Methods:  Open mEMBOSS  Click protein → composition → pepstats calculate statistics of protein properties  In input section click on paste → cut and paste protein sequence → Go
  • 33. Result: Interpretation: pepstats reads one or more protein sequences and writes an output file with various statistics on the protein properties. These include: molecular weight, number of residues, average residue weight, charge, isoelectric point; for each type of amino acid: number, molar percent, DayhoffStat; for each physico-chemical class of amino acid: number, molar percent; probability of protein expression in E. coli inclusion bodies, molar extinction coefficient (A280), extinction coefficient at 1 mg/ml (A280). In this experiment we input a protein sequence and obtained these data. ii. Analyzing protein composition (pepdigest) - trypsin Methods:  Open mEMBOSS  Click protein → composition → pepdigest report on protein proteolytic enzyme or reagent cleavage sites  In input section click on paste → cut and paste protein sequence  In required section select trypsin → Go
  • 34. Result: Analyzing protein composition (pepdigest) - chymotrypsin Methods:  Open mEMBOSS  Click protein → composition → pepdigest report on protein proteolytic enzyme or reagent cleavage sites  In input section click on paste → cut and paste protein sequence  In required section select chymotrypsin → Go Result:
  • 35. Interpretation: This program takes one or more protein sequences as input and lets the user specify one proteolytic agent from a list, which may be a proteolytic enzyme or another reagent. It then writes a report file containing the positions where the agent cuts, together with the peptides produced. The rest of the file consists of columns holding the following data: start position of the fragment, end position of the fragment, molecular weight of the fragment, residue before the cut site ('.' if start of sequence), residue after the second cut site ('.' if end of sequence), sequence of the fragment. In this experiment we input a protein sequence, selected the proteolytic enzymes trypsin and chymotrypsin in turn, and obtained these data as the result. iii. Protein secondary structure by mEmboss: (garnier for protein secondary structure) Methods:  Open mEMBOSS  Click protein → 2D STRUCTURE → garnier predict protein secondary structure using GOR method  In input section click on paste → cut and paste protein sequence → Go Result: Interpretation: Garnier is an implementation of the original Garnier Osguthorpe Robson algorithm (GOR I) for predicting protein secondary structure. It reads an input protein sequence and writes a standard EMBOSS report file with the predicted secondary structure. The Garnier method is not regarded as the most accurate prediction method, but it is simple to calculate on most workstations. In this experiment we input a protein sequence and obtained its predicted secondary structure.
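The fragment table that pepdigest reports (start position, end position, fragment sequence), as described in part ii of this experiment, can be imitated with a short Python sketch. The cleavage rule used here is the simplified textbook rule for trypsin (cut after K or R unless the next residue is P); this is an assumption for illustration, and the rules and options actually applied by EMBOSS pepdigest may differ. Molecular weights are omitted.

```python
def tryptic_digest(protein):
    """Return a list of (start, end, fragment) tuples with 1-based positions.

    Simplified rule (an assumption): cleave on the C-terminal side of
    lysine (K) or arginine (R), except when the next residue is proline (P)."""
    fragments = []
    start = 0  # 0-based index where the current fragment begins
    for i, residue in enumerate(protein):
        is_last = i == len(protein) - 1
        if residue in "KR" and not is_last and protein[i + 1] != "P":
            fragments.append((start + 1, i + 1, protein[start:i + 1]))
            start = i + 1
    if start < len(protein):
        # Whatever remains after the last cut site is the final fragment.
        fragments.append((start + 1, len(protein), protein[start:]))
    return fragments


if __name__ == "__main__":
    for start, end, fragment in tryptic_digest("MKRPAKG"):
        print(start, end, fragment)
```

In the toy sequence the R at position 3 is not cut because it is followed by P, so the second fragment runs from position 3 to position 6.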
  • 36. iv. Protein secondary structure by mEmboss: helixturnhelix for motifs Methods:  Open mEMBOSS  Click protein → 2D STRUCTURE → helixturnhelix identify nucleic acid-binding motifs in protein sequences  In input section click on paste → cut and paste any protein sequence → Go Result: Interpretation: helixturnhelix uses the method of Dodd and Egan to identify helix-turn-helix nucleic acid binding motifs in an input protein sequence. The output is a standard EMBOSS report file describing the location, size and score of any putative motifs. For the sequence we input, the output identified the putative nucleic acid-binding motifs in the protein sequence. Experiment No 11: RNA structure prediction using RNAstructure Methods:  Open RNAstructure  Click file → new sequence  Title - ldh1RNA → sequence (copy and paste 2 lines of sequences from ldh1.fasta) → fold as RNA → yes → select location → file name - ldh1RNA → save → start → draw structures  Draw → go to structure number/ zoom
  • 37. Interpretation: RNAstructure is a software package for RNA secondary structure prediction and analysis. It predicts lowest free energy structures and low free energy structures either by using a heuristic or by determining all possible low free energy structures. From this process we can find RNA secondary structures with different energy levels. The structure with the lowest energy is the most stable. To perform this process we should choose appropriate values for the maximum and minimum energy difference, maximum number of structures, window size, etc.
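As a rough illustration of how a secondary structure can be searched for by dynamic programming, the Python sketch below counts the maximum number of nested base pairs in an RNA sequence (a Nussinov-style recursion). This is a deliberately simpler technique than the free energy minimization used by RNAstructure: it scores every pair equally instead of using thermodynamic parameters, and the function names and the `min_loop` default are assumptions made for this example.

```python
# Watson-Crick pairs plus the G-U wobble pair.
_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True if the two bases can form a pair."""
    return (a, b) in _PAIRS


def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs, requiring at least `min_loop`
    unpaired bases between the two partners of any pair."""
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within rna[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with k
                if can_pair(rna[k], rna[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_base_pairs("GGGAAACCC"))  # a small hairpin with three G-C pairs
```

Energy-based programs replace the "+ 1" per pair with stacking and loop energies and minimize the total, which is why they can rank several candidate structures by stability.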